Overview

Dataset statistics

Number of variables: 38
Number of observations: 5417
Missing cells: 11602
Missing cells (%): 5.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 304.0 B
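
The percentage and size figures above follow from the raw counts. The sketch below only redoes that arithmetic with the reported numbers; it assumes "missing cells (%)" is taken over all cells (variables times observations), which is consistent with the 5.6% shown.

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 38
n_observations = 5417
missing_cells = 11602
avg_record_bytes = 304.0

# Assumption: the missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size implied by the average record size, converted to MiB.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"size in memory: {total_mib:.1f} MiB")
```

Both derived values round to the figures reported above (5.6% and 1.6 MiB).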

Variable types

Categorical: 31
Numeric: 7

Alerts

ocorrencia_cidade has a high cardinality: 1068 distinct values High cardinality
ocorrencia_aerodromo has a high cardinality: 512 distinct values High cardinality
ocorrencia_dia has a high cardinality: 2684 distinct values High cardinality
ocorrencia_hora has a high cardinality: 923 distinct values High cardinality
ocorrencia_localizacao has a high cardinality: 2695 distinct values High cardinality
ocorrencia_DT has a high cardinality: 5144 distinct values High cardinality
ocorrencia_tipo has a high cardinality: 80 distinct values High cardinality
ocorrencia_tipo_categoria has a high cardinality: 79 distinct values High cardinality
aeronave_matricula has a high cardinality: 3901 distinct values High cardinality
aeronave_fabricante has a high cardinality: 233 distinct values High cardinality
aeronave_modelo has a high cardinality: 737 distinct values High cardinality
aeronave_tipo_icao has a high cardinality: 229 distinct values High cardinality
aeronave_voo_origem has a high cardinality: 677 distinct values High cardinality
aeronave_voo_destino has a high cardinality: 674 distinct values High cardinality
aeronave_pmd is highly correlated with aeronave_pmd_categoria and 1 other field High correlation
aeronave_pmd_categoria is highly correlated with aeronave_pmd and 1 other field High correlation
aeronave_assentos is highly correlated with aeronave_pmd and 1 other field High correlation
aeronave_motor_tipo is highly correlated with aeronave_tipo_veiculo and 2 other fields High correlation
ocorrencia_tipo_categoria is highly correlated with aeronave_nivel_dano and 5 other fields High correlation
aeronave_registro_segmento is highly correlated with aeronave_tipo_operacao and 1 other field High correlation
aeronave_nivel_dano is highly correlated with ocorrencia_tipo_categoria and 3 other fields High correlation
aeronave_tipo_veiculo is highly correlated with aeronave_motor_tipo and 1 other field High correlation
taxonomia_tipo_icao is highly correlated with ocorrencia_tipo_categoria and 2 other fields High correlation
aeronave_tipo_operacao is highly correlated with aeronave_registro_segmento and 1 other field High correlation
ocorrencia_tipo is highly correlated with ocorrencia_tipo_categoria and 5 other fields High correlation
divulgacao_relatorio_publicado is highly correlated with aeronave_nivel_dano and 1 other field High correlation
aeronave_operador_categoria is highly correlated with aeronave_registro_segmento and 4 other fields High correlation
investigacao_status is highly correlated with aeronave_operador_categoria High correlation
aeronave_registro_categoria is highly correlated with aeronave_motor_tipo and 1 other field High correlation
ocorrencia_saida_pista is highly correlated with ocorrencia_tipo_categoria and 2 other fields High correlation
investigacao_aeronave_liberada is highly correlated with aeronave_operador_categoria High correlation
ocorrencia_classificacao is highly correlated with ocorrencia_tipo_categoria and 4 other fields High correlation
aeronave_motor_quantidade is highly correlated with aeronave_motor_tipo High correlation
total_aeronaves_envolvidas is highly correlated with ocorrencia_tipo_categoria and 1 other field High correlation
ocorrencia_classificacao is highly correlated with ocorrencia_tipo and 7 other fields High correlation
ocorrencia_uf is highly correlated with aeronave_tipo_operacao High correlation
investigacao_status is highly correlated with aeronave_nivel_dano High correlation
divulgacao_relatorio_publicado is highly correlated with ocorrencia_tipo and 3 other fields High correlation
total_aeronaves_envolvidas is highly correlated with ocorrencia_tipo and 2 other fields High correlation
ocorrencia_saida_pista is highly correlated with ocorrencia_tipo and 2 other fields High correlation
ocorrencia_tipo is highly correlated with ocorrencia_classificacao and 14 other fields High correlation
ocorrencia_tipo_categoria is highly correlated with ocorrencia_classificacao and 14 other fields High correlation
taxonomia_tipo_icao is highly correlated with ocorrencia_classificacao and 7 other fields High correlation
aeronave_operador_categoria is highly correlated with ocorrencia_classificacao and 13 other fields High correlation
aeronave_tipo_veiculo is highly correlated with ocorrencia_tipo and 6 other fields High correlation
aeronave_motor_tipo is highly correlated with ocorrencia_tipo and 11 other fields High correlation
aeronave_motor_quantidade is highly correlated with ocorrencia_tipo and 8 other fields High correlation
aeronave_pmd is highly correlated with aeronave_operador_categoria and 6 other fields High correlation
aeronave_pmd_categoria is highly correlated with aeronave_operador_categoria and 6 other fields High correlation
aeronave_assentos is highly correlated with aeronave_operador_categoria and 6 other fields High correlation
aeronave_ano_fabricacao is highly correlated with aeronave_motor_tipo High correlation
aeronave_registro_categoria is highly correlated with ocorrencia_tipo and 6 other fields High correlation
aeronave_registro_segmento is highly correlated with ocorrencia_classificacao and 13 other fields High correlation
aeronave_fase_operacao is highly correlated with ocorrencia_classificacao and 7 other fields High correlation
aeronave_tipo_operacao is highly correlated with ocorrencia_classificacao and 13 other fields High correlation
aeronave_nivel_dano is highly correlated with ocorrencia_classificacao and 9 other fields High correlation
aeronave_fatalidades_total is highly correlated with aeronave_nivel_dano High correlation
ocorrencia_aerodromo has 1986 (36.7%) missing values Missing
investigacao_aeronave_liberada has 1811 (33.4%) missing values Missing
investigacao_status has 261 (4.8%) missing values Missing
ocorrencia_localizacao has 1524 (28.1%) missing values Missing
aeronave_operador_categoria has 3128 (57.7%) missing values Missing
aeronave_tipo_veiculo has 159 (2.9%) missing values Missing
aeronave_fabricante has 358 (6.6%) missing values Missing
aeronave_modelo has 175 (3.2%) missing values Missing
aeronave_tipo_icao has 271 (5.0%) missing values Missing
aeronave_motor_tipo has 237 (4.4%) missing values Missing
aeronave_motor_quantidade has 94 (1.7%) missing values Missing
aeronave_assentos has 199 (3.7%) missing values Missing
aeronave_ano_fabricacao has 520 (9.6%) missing values Missing
aeronave_registro_categoria has 159 (2.9%) missing values Missing
aeronave_registro_segmento has 73 (1.3%) missing values Missing
aeronave_voo_origem has 204 (3.8%) missing values Missing
aeronave_voo_destino has 196 (3.6%) missing values Missing
aeronave_tipo_operacao has 148 (2.7%) missing values Missing
ocorrencia_dia is uniformly distributed Uniform
ocorrencia_DT is uniformly distributed Uniform
aeronave_matricula is uniformly distributed Uniform
total_recomendacoes has 4779 (88.2%) zeros Zeros
aeronave_pmd has 241 (4.4%) zeros Zeros
aeronave_pmd_categoria has 241 (4.4%) zeros Zeros
aeronave_assentos has 258 (4.8%) zeros Zeros
aeronave_fatalidades_total has 5015 (92.6%) zeros Zeros
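
As a rough illustration of how the cardinality and missing-value alerts above could be derived for a single column, here is a minimal sketch. The function name and both thresholds are hypothetical: they were picked only because they agree with the list above (79 and 80 distinct values are flagged, and so is 1.3% missing), and they are not read from the report's config.json.

```python
from typing import List, Optional, Sequence

# Hypothetical thresholds, chosen to agree with the alerts listed above.
CARDINALITY_THRESHOLD = 50
MISSING_FRACTION_THRESHOLD = 0.01


def column_alerts(values: Sequence[Optional[object]]) -> List[str]:
    """Return alert labels for one column; None marks a missing cell."""
    alerts: List[str] = []
    present = [v for v in values if v is not None]
    if len(set(present)) > CARDINALITY_THRESHOLD:
        alerts.append("High cardinality")
    n_missing = len(values) - len(present)
    if values and n_missing / len(values) > MISSING_FRACTION_THRESHOLD:
        alerts.append("Missing")
    return alerts


# Toy columns (made-up data, not taken from the dataset).
uf = ["SP"] * 98 + ["MG"] * 2
aerodromo = [f"S{i:03d}" for i in range(60)] + [None] * 40
print(column_alerts(uf))
print(column_alerts(aerodromo))
```

The first toy column raises no alert; the second raises both.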

Reproduction

Analysis started: 2022-05-29 04:55:03.469638
Analysis finished: 2022-05-29 04:55:19.030134
Duration: 15.56 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
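
The reported duration is the difference between the two timestamps above. A small check with the standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-05-29 04:55:03.469638", FMT)
finished = datetime.strptime("2022-05-29 04:55:19.030134", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```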

Variables

ocorrencia_classificacao
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value            Count
INCIDENTE        2863
ACIDENTE         1791
INCIDENTE GRAVE  763

Length

Max length: 15
Median length: 9
Mean length: 9.514491416
Min length: 8
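
These length statistics can be reproduced from the three category counts alone. The sketch below expands the counts into one string length per observation; it is a verification aid, not the profiler's own code.

```python
from statistics import median

# Category counts copied from the summary above.
counts = {"INCIDENTE": 2863, "ACIDENTE": 1791, "INCIDENTE GRAVE": 763}

# One length per observation, as if iterating over the column itself.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

total_chars = sum(lengths)
mean_length = total_chars / len(lengths)

print(min(lengths), median(lengths), max(lengths))
print(total_chars, round(mean_length, 9))
```

The total of 51540 characters matches the figure in the "Characters and Unicode" block.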

Characters and Unicode

Total characters: 51540
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
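
To make the "categories" figure concrete, the following sketch tallies the Unicode general category of every character with Python's standard `unicodedata` module, weighting each value by its row count. It covers categories only; scripts and blocks are not exposed by that module, so they are left out.

```python
import unicodedata
from collections import Counter

# Category counts copied from the summary above.
counts = {"INCIDENTE": 2863, "ACIDENTE": 1791, "INCIDENTE GRAVE": 763}

# "Lu" is Uppercase Letter, "Zs" is Space Separator.
category_totals = Counter()
for value, n in counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += n

print(dict(category_totals))
```

Two categories appear, with the space separator accounting for the 763 "INCIDENTE GRAVE" rows.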

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: INCIDENTE
2nd row: ACIDENTE
3rd row: ACIDENTE
4th row: ACIDENTE
5th row: ACIDENTE

Common Values

Value            Count  Frequency (%)
INCIDENTE        2863   52.9%
ACIDENTE         1791   33.1%
INCIDENTE GRAVE  763    14.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value      Count  Frequency (%)
incidente  3626   58.7%
acidente   1791   29.0%
grave      763    12.3%
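
The word-level counts above differ from the category counts because "INCIDENTE GRAVE" contributes to both "incidente" and "grave". A minimal sketch of that tokenisation, assuming lowercasing and whitespace splitting:

```python
from collections import Counter

# Category counts copied from the summary above.
counts = {"INCIDENTE": 2863, "ACIDENTE": 1791, "INCIDENTE GRAVE": 763}

# Split each value into lowercase words and weight by its row count.
word_totals = Counter()
for value, n in counts.items():
    for word in value.lower().split():
        word_totals[word] += n

total_words = sum(word_totals.values())
for word, n in word_totals.most_common():
    print(f"{word:<10} {n:>5} {100 * n / total_words:.1f}%")
```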

Most occurring characters

Value    Count  Frequency (%)
E        11597  22.5%
I        9043   17.5%
N        9043   17.5%
C        5417   10.5%
D        5417   10.5%
T        5417   10.5%
A        2554   5.0%
(space)  763    1.5%
G        763    1.5%
R        763    1.5%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  50777  98.5%
Space Separator   763    1.5%

Most frequent character per category

Uppercase Letter
Value    Count  Frequency (%)
E        11597  22.8%
I        9043   17.8%
N        9043   17.8%
C        5417   10.7%
D        5417   10.7%
T        5417   10.7%
A        2554   5.0%
G        763    1.5%
R        763    1.5%
V        763    1.5%

Space Separator
Value    Count  Frequency (%)
(space)  763    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   50777  98.5%
Common  763    1.5%

Most frequent character per script

Latin
Value    Count  Frequency (%)
E        11597  22.8%
I        9043   17.8%
N        9043   17.8%
C        5417   10.7%
D        5417   10.7%
T        5417   10.7%
A        2554   5.0%
G        763    1.5%
R        763    1.5%
V        763    1.5%

Common
Value    Count  Frequency (%)
(space)  763    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  51540  100.0%

Most frequent character per block

ASCII
Value    Count  Frequency (%)
E        11597  22.5%
I        9043   17.5%
N        9043   17.5%
C        5417   10.5%
D        5417   10.5%
T        5417   10.5%
A        2554   5.0%
(space)  763    1.5%
G        763    1.5%
R        763    1.5%

ocorrencia_cidade
Categorical

HIGH CARDINALITY

Distinct: 1068
Distinct (%): 19.7%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value                Count
SÃO PAULO - SP       257
RIO DE JANEIRO - RJ  256
BELO HORIZONTE - MG  174
CAMPINAS - SP        170
GUARULHOS - SP       130
Other values (1063)  4430

Length

Max length: 37
Median length: 30
Mean length: 15.20509507
Min length: 8

Characters and Unicode

Total characters: 82366
Distinct characters: 40
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 548
Unique (%): 10.1%

Sample

1st row: PORTO ALEGRE - RS
2nd row: GUARULHOS - SP
3rd row: VIAMÃO - RS
4th row: SÃO SEBASTIÃO - SP
5th row: SÃO SEPÉ - RS

Common Values

Value                Count  Frequency (%)
SÃO PAULO - SP       257    4.7%
RIO DE JANEIRO - RJ  256    4.7%
BELO HORIZONTE - MG  174    3.2%
CAMPINAS - SP        170    3.1%
GUARULHOS - SP       130    2.4%
GOIÂNIA - GO         125    2.3%
BRASÍLIA - DF        113    2.1%
LONDRINA - PR        112    2.1%
MANAUS - AM          95     1.8%
PORTO ALEGRE - RS    91     1.7%
Other values (1058)  3894   71.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
5419
27.6%
sp1332
 
6.8%
mg510
 
2.6%
são468
 
2.4%
pr452
 
2.3%
rj414
 
2.1%
rio359
 
1.8%
de351
 
1.8%
rs342
 
1.7%
mt297
 
1.5%
Other values (1090)9667
49.3%

Most occurring characters

ValueCountFrequency (%)
14194
17.2%
A8506
 
10.3%
O6023
 
7.3%
R5855
 
7.1%
-5438
 
6.6%
S5111
 
6.2%
I4519
 
5.5%
P3921
 
4.8%
E3542
 
4.3%
N3026
 
3.7%
Other values (30)22231
27.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter62717
76.1%
Space Separator14194
 
17.2%
Dash Punctuation5438
 
6.6%
Other Punctuation17
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A8506
13.6%
O6023
 
9.6%
R5855
 
9.3%
S5111
 
8.1%
I4519
 
7.2%
P3921
 
6.3%
E3542
 
5.6%
N3026
 
4.8%
M2547
 
4.1%
T2293
 
3.7%
Other values (26)17374
27.7%
Other Punctuation
ValueCountFrequency (%)
'11
64.7%
*6
35.3%
Space Separator
ValueCountFrequency (%)
14194
100.0%
Dash Punctuation
ValueCountFrequency (%)
-5438
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin62717
76.1%
Common19649
 
23.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A8506
13.6%
O6023
 
9.6%
R5855
 
9.3%
S5111
 
8.1%
I4519
 
7.2%
P3921
 
6.3%
E3542
 
5.6%
N3026
 
4.8%
M2547
 
4.1%
T2293
 
3.7%
Other values (26)17374
27.7%
Common
ValueCountFrequency (%)
14194
72.2%
-5438
 
27.7%
'11
 
0.1%
*6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII79919
97.0%
None2447
 
3.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
14194
17.8%
A8506
10.6%
O6023
 
7.5%
R5855
 
7.3%
-5438
 
6.8%
S5111
 
6.4%
I4519
 
5.7%
P3921
 
4.9%
E3542
 
4.4%
N3026
 
3.8%
Other values (19)19784
24.8%
None
ValueCountFrequency (%)
Ã709
29.0%
É368
15.0%
Í338
13.8%
Á334
13.6%
Â223
 
9.1%
Ó220
 
9.0%
Ç184
 
7.5%
Ê30
 
1.2%
Ú18
 
0.7%
Ô17
 
0.7%

ocorrencia_uf
Categorical

HIGH CORRELATION

Distinct: 27
Distinct (%): 0.5%
Missing: 2
Missing (%): < 0.1%
Memory size: 42.4 KiB

Value              Count
SP                 1332
MG                 510
PR                 452
RJ                 414
RS                 342
Other values (22)  2365

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 10830
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RS
2nd row: SP
3rd row: RS
4th row: SP
5th row: RS

Common Values

Value              Count  Frequency (%)
SP                 1332   24.6%
MG                 510    9.4%
PR                 452    8.3%
RJ                 414    7.6%
RS                 342    6.3%
MT                 297    5.5%
GO                 285    5.3%
PA                 269    5.0%
AM                 212    3.9%
BA                 201    3.7%
Other values (17)  1101   20.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sp1332
24.6%
mg510
 
9.4%
pr452
 
8.3%
rj414
 
7.6%
rs342
 
6.3%
mt297
 
5.5%
go285
 
5.3%
pa269
 
5.0%
am212
 
3.9%
ba201
 
3.7%
Other values (17)1101
20.3%

Most occurring characters

ValueCountFrequency (%)
P2219
20.5%
S2109
19.5%
R1383
12.8%
M1247
11.5%
A833
 
7.7%
G795
 
7.3%
J414
 
3.8%
O379
 
3.5%
T345
 
3.2%
C308
 
2.8%
Other values (7)798
 
7.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter10830
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P2219
20.5%
S2109
19.5%
R1383
12.8%
M1247
11.5%
A833
 
7.7%
G795
 
7.3%
J414
 
3.8%
O379
 
3.5%
T345
 
3.2%
C308
 
2.8%
Other values (7)798
 
7.4%

Most occurring scripts

ValueCountFrequency (%)
Latin10830
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P2219
20.5%
S2109
19.5%
R1383
12.8%
M1247
11.5%
A833
 
7.7%
G795
 
7.3%
J414
 
3.8%
O379
 
3.5%
T345
 
3.2%
C308
 
2.8%
Other values (7)798
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII10830
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P2219
20.5%
S2109
19.5%
R1383
12.8%
M1247
11.5%
A833
 
7.7%
G795
 
7.3%
J414
 
3.8%
O379
 
3.5%
T345
 
3.2%
C308
 
2.8%
Other values (7)798
 
7.4%

ocorrencia_aerodromo
Categorical

HIGH CARDINALITY
MISSING

Distinct: 512
Distinct (%): 14.9%
Missing: 1986
Missing (%): 36.7%
Memory size: 42.4 KiB

Value               Count
SBGR                126
SBMT                121
SBBH                106
SBKP                98
SBLO                87
Other values (507)  2893

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 13724
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 248
Unique (%): 7.2%

Sample

1st row: SBPA
2nd row: SBGR
3rd row: SDAI
4th row: SBBE
5th row: SBUL

Common Values

Value               Count  Frequency (%)
SBGR                126    2.3%
SBMT                121    2.2%
SBBH                106    2.0%
SBKP                98     1.8%
SBLO                87     1.6%
SBGL                84     1.6%
SBBR                80     1.5%
SBJD                78     1.4%
SBSP                70     1.3%
SBPA                68     1.3%
Other values (502)  2513   46.4%
(Missing)           1986   36.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sbgr126
 
3.7%
sbmt121
 
3.5%
sbbh106
 
3.1%
sbkp98
 
2.9%
sblo87
 
2.5%
sbgl84
 
2.4%
sbbr80
 
2.3%
sbjd78
 
2.3%
sbsp70
 
2.0%
sbpa68
 
2.0%
Other values (502)2513
73.2%

Most occurring characters

ValueCountFrequency (%)
S3911
28.5%
B2985
21.8%
R569
 
4.1%
P568
 
4.1%
G514
 
3.7%
D449
 
3.3%
J383
 
2.8%
L370
 
2.7%
N359
 
2.6%
C348
 
2.5%
Other values (18)3268
23.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter13715
99.9%
Decimal Number9
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S3911
28.5%
B2985
21.8%
R569
 
4.1%
P568
 
4.1%
G514
 
3.7%
D449
 
3.3%
J383
 
2.8%
L370
 
2.7%
N359
 
2.6%
C348
 
2.5%
Other values (16)3259
23.8%
Decimal Number
ValueCountFrequency (%)
98
88.9%
51
 
11.1%

Most occurring scripts

ValueCountFrequency (%)
Latin13715
99.9%
Common9
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S3911
28.5%
B2985
21.8%
R569
 
4.1%
P568
 
4.1%
G514
 
3.7%
D449
 
3.3%
J383
 
2.8%
L370
 
2.7%
N359
 
2.6%
C348
 
2.5%
Other values (16)3259
23.8%
Common
ValueCountFrequency (%)
98
88.9%
51
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII13724
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S3911
28.5%
B2985
21.8%
R569
 
4.1%
P568
 
4.1%
G514
 
3.7%
D449
 
3.3%
J383
 
2.8%
L370
 
2.7%
N359
 
2.6%
C348
 
2.5%
Other values (18)3268
23.8%

ocorrencia_dia
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 2684
Distinct (%): 49.5%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value                Count
22/02/2017           9
18/12/2013           9
05/11/2021           8
09/02/2014           7
07/08/2019           7
Other values (2679)  5377

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 54170
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1096
Unique (%): 20.2%

Sample

1st row: 05/01/2012
2nd row: 06/01/2012
3rd row: 06/01/2012
4th row: 06/01/2012
5th row: 06/01/2012
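
This variable is profiled as categorical because the dates are stored as text. Values such as 22/02/2017 show the format is day/month/year, so a sketch of converting the sample rows to real dates before any time-based analysis could look like this (standard library only):

```python
from datetime import datetime

# Sample rows copied from above; the format is day/month/year.
sample = ["05/01/2012", "06/01/2012", "06/01/2012", "06/01/2012", "06/01/2012"]

parsed = [datetime.strptime(text, "%d/%m/%Y").date() for text in sample]

print(parsed[0].isoformat())
print(len(set(parsed)))
```

Parsing with an explicit format avoids the day and month being swapped for ambiguous values such as 05/01/2012.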

Common Values

Value                Count  Frequency (%)
22/02/2017           9      0.2%
18/12/2013           9      0.2%
05/11/2021           8      0.1%
09/02/2014           7      0.1%
07/08/2019           7      0.1%
13/05/2021           7      0.1%
11/10/2013           6      0.1%
17/02/2012           6      0.1%
03/05/2013           6      0.1%
08/05/2013           6      0.1%
Other values (2674)  5346   98.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
22/02/20179
 
0.2%
18/12/20139
 
0.2%
05/11/20218
 
0.1%
09/02/20147
 
0.1%
07/08/20197
 
0.1%
13/05/20217
 
0.1%
16/10/20206
 
0.1%
12/02/20146
 
0.1%
09/04/20146
 
0.1%
12/04/20196
 
0.1%
Other values (2674)5346
98.7%

Most occurring characters

ValueCountFrequency (%)
012627
23.3%
/10834
20.0%
210476
19.3%
19543
17.6%
32019
 
3.7%
91514
 
2.8%
41508
 
2.8%
81457
 
2.7%
51428
 
2.6%
71406
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number43336
80.0%
Other Punctuation10834
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
012627
29.1%
210476
24.2%
19543
22.0%
32019
 
4.7%
91514
 
3.5%
41508
 
3.5%
81457
 
3.4%
51428
 
3.3%
71406
 
3.2%
61358
 
3.1%
Other Punctuation
ValueCountFrequency (%)
/10834
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common54170
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
012627
23.3%
/10834
20.0%
210476
19.3%
19543
17.6%
32019
 
3.7%
91514
 
2.8%
41508
 
2.8%
81457
 
2.7%
51428
 
2.6%
71406
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII54170
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
012627
23.3%
/10834
20.0%
210476
19.3%
19543
17.6%
32019
 
3.7%
91514
 
2.8%
41508
 
2.8%
81457
 
2.7%
51428
 
2.6%
71406
 
2.6%

ocorrencia_hora
Categorical

HIGH CARDINALITY

Distinct: 923
Distinct (%): 17.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 42.4 KiB

Value               Count
20:00:00            115
19:00:00            90
18:00:00            86
13:00:00            86
13:30:00            84
Other values (918)  4955

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 43328
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 322
Unique (%): 5.9%

Sample

1st row: 20:27:00
2nd row: 13:44:00
3rd row: 13:00:00
4th row: 17:00:00
5th row: 16:30:00
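
The times are also stored as HH:MM:SS text. A short sketch of parsing the sample rows and extracting the hour, plus a flag for values that fall exactly on the hour or half hour (which may indicate rounded reports; that reading is an assumption, not something the report states):

```python
from datetime import datetime

# Sample rows copied from above.
sample = ["20:27:00", "13:44:00", "13:00:00", "17:00:00", "16:30:00"]

times = [datetime.strptime(text, "%H:%M:%S").time() for text in sample]
hours = [t.hour for t in times]
on_half_hour = [t.minute in (0, 30) and t.second == 0 for t in times]

print(hours)
print(on_half_hour)
```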

Common Values

Value               Count  Frequency (%)
20:00:00            115    2.1%
19:00:00            90     1.7%
18:00:00            86     1.6%
13:00:00            86     1.6%
13:30:00            84     1.6%
12:00:00            82     1.5%
20:30:00            81     1.5%
19:30:00            77     1.4%
14:00:00            76     1.4%
15:00:00            76     1.4%
Other values (913)  4563   84.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
20:00:00115
 
2.1%
19:00:0090
 
1.7%
18:00:0086
 
1.6%
13:00:0086
 
1.6%
13:30:0084
 
1.6%
12:00:0082
 
1.5%
20:30:0081
 
1.5%
19:30:0077
 
1.4%
14:00:0076
 
1.4%
15:00:0076
 
1.4%
Other values (913)4563
84.3%

Most occurring characters

ValueCountFrequency (%)
016375
37.8%
:10832
25.0%
15502
 
12.7%
22488
 
5.7%
52061
 
4.8%
32012
 
4.6%
41445
 
3.3%
9750
 
1.7%
8707
 
1.6%
7620
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number32496
75.0%
Other Punctuation10832
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
016375
50.4%
15502
 
16.9%
22488
 
7.7%
52061
 
6.3%
32012
 
6.2%
41445
 
4.4%
9750
 
2.3%
8707
 
2.2%
7620
 
1.9%
6536
 
1.6%
Other Punctuation
ValueCountFrequency (%)
:10832
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common43328
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
016375
37.8%
:10832
25.0%
15502
 
12.7%
22488
 
5.7%
52061
 
4.8%
32012
 
4.6%
41445
 
3.3%
9750
 
1.7%
8707
 
1.6%
7620
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII43328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
016375
37.8%
:10832
25.0%
15502
 
12.7%
22488
 
5.7%
52061
 
4.8%
32012
 
4.6%
41445
 
3.3%
9750
 
1.7%
8707
 
1.6%
7620
 
1.4%

investigacao_aeronave_liberada
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 0.1%
Missing: 1811
Missing (%): 33.4%
Memory size: 42.4 KiB

Value  Count
SIM    3569
NÃO    37

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 10818
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SIM
2nd row: SIM
3rd row: SIM
4th row: SIM
5th row: SIM

Common Values

Value      Count  Frequency (%)
SIM        3569   65.9%
NÃO        37     0.7%
(Missing)  1811   33.4%
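
A yes/no column with a third of its cells missing is usually easier to work with as a nullable boolean. The sketch below maps SIM and NÃO to True and False and keeps missing cells as None; `parse_flag` is a hypothetical helper introduced only for illustration.

```python
from typing import Optional

FLAG_MAP = {"SIM": True, "NÃO": False}


def parse_flag(value: Optional[str]) -> Optional[bool]:
    """Map 'SIM'/'NÃO' to True/False; keep missing cells as None."""
    if value is None:
        return None
    return FLAG_MAP[value.strip().upper()]


# Counts copied from the table above; None stands for "(Missing)".
counts = {"SIM": 3569, "NÃO": 37, None: 1811}

n_total = sum(counts.values())
n_present = n_total - counts[None]
share_released = 100 * counts["SIM"] / n_present

print(parse_flag("SIM"), parse_flag("NÃO"), parse_flag(None))
print(n_total, f"{share_released:.1f}%")
```

Among the non-missing rows, the share of SIM is about 99%, which is the figure the word-frequency table for this variable reports.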

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
sim3569
99.0%
não37
 
1.0%

Most occurring characters

ValueCountFrequency (%)
S3569
33.0%
I3569
33.0%
M3569
33.0%
N37
 
0.3%
Ã37
 
0.3%
O37
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter10818
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S3569
33.0%
I3569
33.0%
M3569
33.0%
N37
 
0.3%
Ã37
 
0.3%
O37
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin10818
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S3569
33.0%
I3569
33.0%
M3569
33.0%
N37
 
0.3%
Ã37
 
0.3%
O37
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII10781
99.7%
None37
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S3569
33.1%
I3569
33.1%
M3569
33.1%
N37
 
0.3%
O37
 
0.3%
None
ValueCountFrequency (%)
Ã37
100.0%

investigacao_status
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 261
Missing (%): 4.8%
Memory size: 42.4 KiB

Value       Count
FINALIZADA  4662
ATIVA       494

Length

Max length: 10
Median length: 10
Mean length: 9.52094647
Min length: 5

Characters and Unicode

Total characters: 49090
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FINALIZADA
2nd row: FINALIZADA
3rd row: FINALIZADA
4th row: FINALIZADA
5th row: FINALIZADA

Common Values

Value       Count  Frequency (%)
FINALIZADA  4662   86.1%
ATIVA       494    9.1%
(Missing)   261    4.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
finalizada4662
90.4%
ativa494
 
9.6%

Most occurring characters

ValueCountFrequency (%)
A14974
30.5%
I9818
20.0%
F4662
 
9.5%
N4662
 
9.5%
L4662
 
9.5%
Z4662
 
9.5%
D4662
 
9.5%
T494
 
1.0%
V494
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter49090
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A14974
30.5%
I9818
20.0%
F4662
 
9.5%
N4662
 
9.5%
L4662
 
9.5%
Z4662
 
9.5%
D4662
 
9.5%
T494
 
1.0%
V494
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin49090
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A14974
30.5%
I9818
20.0%
F4662
 
9.5%
N4662
 
9.5%
L4662
 
9.5%
Z4662
 
9.5%
D4662
 
9.5%
T494
 
1.0%
V494
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII49090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A14974
30.5%
I9818
20.0%
F4662
 
9.5%
N4662
 
9.5%
L4662
 
9.5%
Z4662
 
9.5%
D4662
 
9.5%
T494
 
1.0%
V494
 
1.0%

divulgacao_relatorio_publicado
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value  Count
NÃO    3979
SIM    1438

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 16251
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NÃO
2nd row: SIM
3rd row: SIM
4th row: NÃO
5th row: SIM

Common Values

Value  Count  Frequency (%)
NÃO    3979   73.5%
SIM    1438   26.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
não3979
73.5%
sim1438
 
26.5%

Most occurring characters

ValueCountFrequency (%)
N3979
24.5%
Ã3979
24.5%
O3979
24.5%
S1438
 
8.8%
I1438
 
8.8%
M1438
 
8.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter16251
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N3979
24.5%
Ã3979
24.5%
O3979
24.5%
S1438
 
8.8%
I1438
 
8.8%
M1438
 
8.8%

Most occurring scripts

ValueCountFrequency (%)
Latin16251
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N3979
24.5%
Ã3979
24.5%
O3979
24.5%
S1438
 
8.8%
I1438
 
8.8%
M1438
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII12272
75.5%
None3979
 
24.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N3979
32.4%
O3979
32.4%
S1438
 
11.7%
I1438
 
11.7%
M1438
 
11.7%
None
ValueCountFrequency (%)
Ã3979
100.0%

total_recomendacoes
Real number (ℝ≥0)

ZEROS

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2396160236
Minimum: 0
Maximum: 13
Zeros: 4779
Zeros (%): 88.2%
Negative: 0
Negative (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 13
Range: 13
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8562674489
Coefficient of variation (CV): 3.573498283
Kurtosis: 47.42642397
Mean: 0.2396160236
Median Absolute Deviation (MAD): 0
Skewness: 5.804398978
Sum: 1298
Variance: 0.7331939441
Monotonicity: Not monotonic
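
Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The check below uses only the reported figures, with 5417 observations taken from the dataset overview.

```python
# Reported figures for total_recomendacoes.
n_observations = 5417
total = 1298
std = 0.8562674489

mean = total / n_observations
cv = std / mean          # coefficient of variation
variance = std ** 2

print(round(mean, 10))
print(round(cv, 6))
print(round(variance, 6))
```

All three results agree with the reported values to the precision shown in the report.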
Histogram with fixed size bins (bins=13)

Common Values

Value             Count  Frequency (%)
0                 4779   88.2%
1                 316    5.8%
2                 175    3.2%
3                 70     1.3%
4                 39     0.7%
6                 11     0.2%
5                 9      0.2%
7                 7      0.1%
9                 6      0.1%
8                 2      < 0.1%
Other values (3)  3      0.1%

Minimum 10 values

Value  Count  Frequency (%)
0      4779   88.2%
1      316    5.8%
2      175    3.2%
3      70     1.3%
4      39     0.7%
5      9      0.2%
6      11     0.2%
7      7      0.1%
8      2      < 0.1%
9      6      0.1%

Maximum 10 values

Value  Count  Frequency (%)
13     1      < 0.1%
12     1      < 0.1%
11     1      < 0.1%
9      6      0.1%
8      2      < 0.1%
7      7      0.1%
6      11     0.2%
5      9      0.2%
4      39     0.7%
3      70     1.3%

total_aeronaves_envolvidas
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value  Count
1      5280
2      128
3      9

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 5417
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      5280   97.5%
2      128    2.4%
3      9      0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
15280
97.5%
2128
 
2.4%
39
 
0.2%

Most occurring characters

ValueCountFrequency (%)
15280
97.5%
2128
 
2.4%
39
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number5417
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
15280
97.5%
2128
 
2.4%
39
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common5417
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
15280
97.5%
2128
 
2.4%
39
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII5417
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
15280
97.5%
2128
 
2.4%
39
 
0.2%

ocorrencia_saida_pista
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value  Count
NÃO    4809
SIM    608

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 16251
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NÃO
2nd row: NÃO
3rd row: NÃO
4th row: NÃO
5th row: NÃO

Common Values

Value  Count  Frequency (%)
NÃO    4809   88.8%
SIM    608    11.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
não4809
88.8%
sim608
 
11.2%

Most occurring characters

ValueCountFrequency (%)
N4809
29.6%
Ã4809
29.6%
O4809
29.6%
S608
 
3.7%
I608
 
3.7%
M608
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter16251
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N4809
29.6%
Ã4809
29.6%
O4809
29.6%
S608
 
3.7%
I608
 
3.7%
M608
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
Latin16251
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N4809
29.6%
Ã4809
29.6%
O4809
29.6%
S608
 
3.7%
I608
 
3.7%
M608
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII11442
70.4%
None4809
29.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N4809
42.0%
O4809
42.0%
S608
 
5.3%
I608
 
5.3%
M608
 
5.3%
None
ValueCountFrequency (%)
Ã4809
100.0%

ocorrencia_localizacao
Categorical

HIGH CARDINALITY
MISSING

Distinct: 2695
Distinct (%): 69.2%
Missing: 1524
Missing (%): 28.1%
Memory size: 42.4 KiB

Value                                Count
\t-23.33027778\t / \t-51.13666667\t  39
-23.00694444444 / -47.13444444444    33
\t-29.99388889\t / \t-51.17111111\t  27
-23.0069444444 / -47.1344444444      22
\t-25.53166667\t / \t-49.17611111\t  20
Other values (2690)                  3752

Length

Max length35
Median length33
Mean length28.38196763
Min length9

Characters and Unicode

Total characters110491
Distinct characters41
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2269 ?
Unique (%)58.3%

Sample

1st row-23.4355555556 / -46.4730555556
2nd row-19.9133333333 / -48.2930555556
3rd row-5.3677777778 / -49.1383333333
4th row-24.5888888889 / -48.2113888889
5th row-16.0388888889 / -57.5958333333

Common Values

Value                                Count  Frequency (%)
\t-23.33027778\t / \t-51.13666667\t  39     0.7%
-23.00694444444 / -47.13444444444    33     0.6%
\t-29.99388889\t / \t-51.17111111\t  27     0.5%
-23.0069444444 / -47.1344444444      22     0.4%
\t-25.53166667\t / \t-49.17611111\t  20     0.4%
-23.1816666667 / -46.9436111111      18     0.3%
-23.6261111111 / -46.6563888889      17     0.3%
-12.9086111111 / -38.3225            17     0.3%
\t-25.40333333\t / \t-49.23361111\t  17     0.3%
-19.8519444444 / -43.9505555556      16     0.3%
Other values (2685)                  3667   67.7%
(Missing)                            1524   28.1%
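
The table suggests that the same place can appear under several spellings: some values carry literal \t sequences, and the coordinates are written with different numbers of decimals. A minimal normalisation sketch follows. parse_location is a hypothetical helper, and rounding to six decimals is an assumption chosen for illustration. Values in other notations (the character counts for this column include a degree sign and letters such as S and W) are returned as None rather than guessed at.

```python
from typing import Optional, Tuple


def parse_location(raw: Optional[str], ndigits: int = 6) -> Optional[Tuple[float, float]]:
    """Parse 'lat / lon' text into rounded floats; return None if it does not fit that shape."""
    if raw is None:
        return None
    # Drop literal backslash-t sequences and real tab characters, then trim spaces.
    cleaned = raw.replace("\\t", "").replace("\t", "").strip()
    parts = [part.strip() for part in cleaned.split("/")]
    if len(parts) != 2:
        return None
    try:
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        # Other notations (for example with a degree sign) are left unparsed here.
        return None
    return round(lat, ndigits), round(lon, ndigits)


a = parse_location("\\t-23.33027778\\t / \\t-51.13666667\\t")
b = parse_location("-23.00694444444 / -47.13444444444")
c = parse_location("-23.0069444444 / -47.1344444444")
print(a, b == c)
```

With this rounding, the second and fourth rows of the table collapse into one location, which would lower the distinct count reported above.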

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3896
33.3%
22.8145
 
0.4%
t-23.33027778\t41
 
0.4%
t-51.13666667\t39
 
0.3%
23.0069444444436
 
0.3%
47.1344444444435
 
0.3%
t-29.99388889\t30
 
0.3%
t-51.17111111\t29
 
0.2%
47.134444444426
 
0.2%
23.006944444423
 
0.2%
Other values (5150)7509
64.1%

Most occurring characters

ValueCountFrequency (%)
410356
9.4%
.9476
 
8.6%
19252
 
8.4%
28860
 
8.0%
58720
 
7.9%
68446
 
7.6%
38174
 
7.4%
7816
 
7.1%
-7663
 
6.9%
87634
 
6.9%
Other values (31)24094
21.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number76632
69.4%
Other Punctuation15827
 
14.3%
Space Separator7816
 
7.1%
Dash Punctuation7663
 
6.9%
Lowercase Letter2428
 
2.2%
Other Symbol73
 
0.1%
Uppercase Letter44
 
< 0.1%
Final Punctuation8
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S9
20.5%
W9
20.5%
O6
13.6%
N4
9.1%
Ã2
 
4.5%
A2
 
4.5%
M2
 
4.5%
R2
 
4.5%
F2
 
4.5%
I2
 
4.5%
Other values (2)4
9.1%
Decimal Number
ValueCountFrequency (%)
410356
13.5%
19252
12.1%
28860
11.6%
58720
11.4%
68446
11.0%
38174
10.7%
87634
10.0%
77246
9.5%
94094
 
5.3%
03850
 
5.0%
Lowercase Letter
ValueCountFrequency (%)
t2414
99.4%
o2
 
0.1%
n2
 
0.1%
e2
 
0.1%
d2
 
0.1%
u2
 
0.1%
i2
 
0.1%
g2
 
0.1%
Other Punctuation
ValueCountFrequency (%)
.9476
59.9%
/3893
24.6%
\2412
 
15.2%
,38
 
0.2%
*6
 
< 0.1%
:2
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
4
50.0%
4
50.0%
Space Separator
ValueCountFrequency (%)
7816
100.0%
Dash Punctuation
ValueCountFrequency (%)
-7663
100.0%
Other Symbol
ValueCountFrequency (%)
°73
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common108019
97.8%
Latin2472
 
2.2%

Most frequent character per script

Common
ValueCountFrequency (%)
410356
9.6%
.9476
8.8%
19252
8.6%
28860
8.2%
58720
8.1%
68446
 
7.8%
38174
 
7.6%
7816
 
7.2%
-7663
 
7.1%
87634
 
7.1%
Other values (11)21622
20.0%
Latin
ValueCountFrequency (%)
t2414
97.7%
S9
 
0.4%
W9
 
0.4%
O6
 
0.2%
N4
 
0.2%
o2
 
0.1%
Ã2
 
0.1%
A2
 
0.1%
M2
 
0.1%
R2
 
0.1%
Other values (10)20
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII110408
99.9%
None75
 
0.1%
Punctuation8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
410356
9.4%
.9476
 
8.6%
19252
 
8.4%
28860
 
8.0%
58720
 
7.9%
68446
 
7.6%
38174
 
7.4%
7816
 
7.1%
-7663
 
6.9%
87634
 
6.9%
Other values (27)24011
21.7%
None
ValueCountFrequency (%)
°73
97.3%
Ã2
 
2.7%
Punctuation
ValueCountFrequency (%)
4
50.0%
4
50.0%

ocorrencia_DT
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 5144
Distinct (%): 95.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 42.4 KiB

Value                Count
2021-12-07 19:10:00  4
2021-05-11 17:10:00  4
2020-12-07 16:14:00  3
2020-10-16 14:30:00  3
2019-07-28 20:00:00  3
Other values (5139)  5399

Length

Max length19
Median length19
Mean length19
Min length19

Characters and Unicode

Total characters102904
Distinct characters13
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4888 ?
Unique (%)90.3%

Sample

1st row2012-05-01 20:27:00
2nd row2012-06-01 13:44:00
3rd row2012-06-01 13:00:00
4th row2012-06-01 17:00:00
5th row2012-06-01 16:30:00

Common Values

Value                Count  Frequency (%)
2021-12-07 19:10:00  4      0.1%
2021-05-11 17:10:00  4      0.1%
2020-12-07 16:14:00  3      0.1%
2020-10-16 14:30:00  3      0.1%
2019-07-28 20:00:00  3      0.1%
2018-11-08 20:15:00  3      0.1%
2021-05-13 11:02:00  3      0.1%
2019-07-09 20:00:00  3      0.1%
2012-02-17 23:00:00  3      0.1%
2019-08-18 18:43:00  3      0.1%
Other values (5134)  5384   99.4%
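
The report treats this column as categorical text, but every value has the same 19-character date-time shape, so it can be parsed into a real timestamp. The sketch below uses only the standard library. parse_dt and DT_FORMAT are hypothetical names, and the format string is inferred from the sample rows rather than documented anywhere in the report.

```python
from datetime import datetime
from typing import Optional

# Inferred from the sample rows: fixed 19-character "YYYY-MM-DD HH:MM:SS" text.
DT_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_dt(raw: Optional[str]) -> Optional[datetime]:
    """Parse one value; return None for missing or malformed input."""
    if not raw:
        return None
    try:
        return datetime.strptime(raw.strip(), DT_FORMAT)
    except ValueError:
        return None


first = parse_dt("2012-05-01 20:27:00")
print(first, first.month, first.hour)
```

Once parsed, fields such as month or hour can be derived directly instead of being counted as text fragments.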

Length

Histogram of lengths of the category
ValueCountFrequency (%)
20:00:00115
 
1.1%
19:00:0090
 
0.8%
18:00:0086
 
0.8%
13:00:0086
 
0.8%
13:30:0084
 
0.8%
12:00:0082
 
0.8%
20:30:0081
 
0.7%
19:30:0077
 
0.7%
14:00:0076
 
0.7%
15:00:0076
 
0.7%
Other values (3597)9979
92.1%

Most occurring characters

ValueCountFrequency (%)
028999
28.2%
115044
14.6%
212963
12.6%
-10832
 
10.5%
:10832
 
10.5%
5416
 
5.3%
34031
 
3.9%
53489
 
3.4%
42953
 
2.9%
92263
 
2.2%
Other values (3)6082
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number75824
73.7%
Dash Punctuation10832
 
10.5%
Other Punctuation10832
 
10.5%
Space Separator5416
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
028999
38.2%
115044
19.8%
212963
17.1%
34031
 
5.3%
53489
 
4.6%
42953
 
3.9%
92263
 
3.0%
82164
 
2.9%
72024
 
2.7%
61894
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
-10832
100.0%
Other Punctuation
ValueCountFrequency (%)
:10832
100.0%
Space Separator
ValueCountFrequency (%)
5416
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common102904
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
028999
28.2%
115044
14.6%
212963
12.6%
-10832
 
10.5%
:10832
 
10.5%
5416
 
5.3%
34031
 
3.9%
53489
 
3.4%
42953
 
2.9%
92263
 
2.2%
Other values (3)6082
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII102904
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
028999
28.2%
115044
14.6%
212963
12.6%
-10832
 
10.5%
:10832
 
10.5%
5416
 
5.3%
34031
 
3.9%
53489
 
3.4%
42953
 
2.9%
92263
 
2.2%
Other values (3)6082
 
5.9%

ocorrencia_mes
Real number (ℝ≥0)

Distinct: 12
Distinct (%): 0.2%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.463626292
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.475513457
Coefficient of variation (CV): 0.5377033417
Kurtosis: -1.242413432
Mean: 6.463626292
Median Absolute Deviation (MAD): 3
Skewness: -0.0005830875398
Sum: 35007
Variance: 12.07919379
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
3                 526    9.7%
1                 486    9.0%
9                 479    8.8%
11                453    8.4%
10                453    8.4%
8                 452    8.3%
12                437    8.1%
6                 431    8.0%
7                 431    8.0%
4                 427    7.9%
Other values (2)  841    15.5%

Value  Count  Frequency (%)
1      486    9.0%
2      419    7.7%
3      526    9.7%
4      427    7.9%
5      422    7.8%
6      431    8.0%
7      431    8.0%
8      452    8.3%
9      479    8.8%
10     453    8.4%

Value  Count  Frequency (%)
12     437    8.1%
11     453    8.4%
10     453    8.4%
9      479    8.8%
8      452    8.3%
7      431    8.0%
6      431    8.0%
5      422    7.8%
4      427    7.9%
3      526    9.7%
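
The summary statistics for this variable can be cross-checked against the per-month counts. The sketch below recomputes the sum, mean and variance from the frequency tables. The counts dictionary is copied from those tables, and the n - 1 divisor is an assumption that happens to reproduce the reported variance.

```python
# Occurrences per month, copied from the frequency tables for ocorrencia_mes.
counts = {1: 486, 2: 419, 3: 526, 4: 427, 5: 422, 6: 431,
          7: 431, 8: 452, 9: 479, 10: 453, 11: 453, 12: 437}

n = sum(counts.values())                                # non-missing observations
total = sum(month * c for month, c in counts.items())   # "Sum" in the report
mean = total / n

# Sample variance (divides by n - 1).
variance = sum(c * (month - mean) ** 2 for month, c in counts.items()) / (n - 1)

print(n, total, round(mean, 6), round(variance, 6))
```

The observation count comes out one short of the 5417 rows in the dataset, which agrees with the single missing value reported for this variable.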

ocorrencia_tipo
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 80
Distinct (%): 1.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 42.4 KiB

Value                                               Count
FALHA DO MOTOR EM VOO                               669
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE  596
ESTOURO DE PNEU                                     578
PERDA DE CONTROLE NO SOLO                           386
PERDA DE CONTROLE EM VOO                            327
Other values (75)                                   2860

Length

Max length92
Median length42
Mean length23.84268833
Min length6

Characters and Unicode

Total characters129132
Distinct characters41
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6 ?
Unique (%)0.1%

Sample

1st rowESTOURO DE PNEU
2nd rowCOM PESSOAL EM VOO
3rd rowFALHA DO MOTOR EM VOO
4th rowFALHA DO MOTOR EM VOO
5th rowPERDA DE CONTROLE NO SOLO

Common Values

Value                                               Count  Frequency (%)
FALHA DO MOTOR EM VOO                               669    12.4%
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE  596    11.0%
ESTOURO DE PNEU                                     578    10.7%
PERDA DE CONTROLE NO SOLO                           386    7.1%
PERDA DE CONTROLE EM VOO                            327    6.0%
COM TREM DE POUSO                                   314    5.8%
OUTROS                                              246    4.5%
COLISÃO COM AVE                                     199    3.7%
EXCURSÃO DE PISTA                                   188    3.5%
COLISÃO COM OBSTÁCULO DURANTE A DECOLAGEM E POUSO   173    3.2%
Other values (70)                                   1740   32.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
de2662
 
11.3%
falha1329
 
5.6%
em1279
 
5.4%
voo1212
 
5.1%
com995
 
4.2%
pouso827
 
3.5%
789
 
3.3%
perda788
 
3.3%
motor731
 
3.1%
do731
 
3.1%
Other values (125)12292
52.0%

Most occurring characters

ValueCountFrequency (%)
O20336
15.7%
18219
14.1%
E12928
10.0%
A8892
 
6.9%
M6812
 
5.3%
N6776
 
5.2%
T6468
 
5.0%
S5938
 
4.6%
R5463
 
4.2%
U5376
 
4.2%
Other values (31)31924
24.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter109878
85.1%
Space Separator18219
 
14.1%
Other Punctuation922
 
0.7%
Dash Punctuation67
 
0.1%
Open Punctuation23
 
< 0.1%
Close Punctuation23
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O20336
18.5%
E12928
11.8%
A8892
 
8.1%
M6812
 
6.2%
N6776
 
6.2%
T6468
 
5.9%
S5938
 
5.4%
R5463
 
5.0%
U5376
 
4.9%
D5211
 
4.7%
Other values (24)25678
23.4%
Other Punctuation
ValueCountFrequency (%)
/797
86.4%
.117
 
12.7%
,8
 
0.9%
Space Separator
ValueCountFrequency (%)
18219
100.0%
Dash Punctuation
ValueCountFrequency (%)
-67
100.0%
Open Punctuation
ValueCountFrequency (%)
(23
100.0%
Close Punctuation
ValueCountFrequency (%)
)23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin109878
85.1%
Common19254
 
14.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
O20336
18.5%
E12928
11.8%
A8892
 
8.1%
M6812
 
6.2%
N6776
 
6.2%
T6468
 
5.9%
S5938
 
5.4%
R5463
 
5.0%
U5376
 
4.9%
D5211
 
4.7%
Other values (24)25678
23.4%
Common
ValueCountFrequency (%)
18219
94.6%
/797
 
4.1%
.117
 
0.6%
-67
 
0.3%
(23
 
0.1%
)23
 
0.1%
,8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII127099
98.4%
None2033
 
1.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O20336
16.0%
18219
14.3%
E12928
10.2%
A8892
 
7.0%
M6812
 
5.4%
N6776
 
5.3%
T6468
 
5.1%
S5938
 
4.7%
R5463
 
4.3%
U5376
 
4.2%
Other values (21)29891
23.5%
None
ValueCountFrequency (%)
Ã1058
52.0%
Á382
 
18.8%
Ç180
 
8.9%
É134
 
6.6%
Ó114
 
5.6%
Ô96
 
4.7%
Í34
 
1.7%
Õ15
 
0.7%
Ê12
 
0.6%
Â8
 
0.4%

ocorrencia_tipo_categoria
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 79
Distinct (%): 1.5%
Missing: 3
Missing (%): 0.1%
Memory size: 42.4 KiB

Value                                                                 Count
FALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO           669
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE                    596
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | ESTOURO DE PNEU  578
PERDA DE CONTROLE NO SOLO                                             386
PERDA DE CONTROLE EM VOO                                              327
Other values (74)                                                     2858

Length

Max length96
Median length80
Mean length43.68821574
Min length6

Characters and Unicode

Total characters236528
Distinct characters42
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6 ?
Unique (%)0.1%

Sample

1st rowFALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | ESTOURO DE PNEU
2nd rowOUTROS | COM PESSOAL EM VOO
3rd rowFALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO
4th rowFALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO
5th rowPERDA DE CONTROLE NO SOLO

Common Values

Value                                                                   Count  Frequency (%)
FALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO             669    12.4%
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE                      596    11.0%
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | ESTOURO DE PNEU    578    10.7%
PERDA DE CONTROLE NO SOLO                                               386    7.1%
PERDA DE CONTROLE EM VOO                                                327    6.0%
FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | COM TREM DE POUSO  314    5.8%
OUTROS                                                                  246    4.5%
COLISÃO COM AVE                                                         199    3.7%
EXCURSÃO DE PISTA                                                       188    3.5%
COLISÃO COM OBSTÁCULO DURANTE A DECOLAGEM E POUSO                       173    3.2%
Other values (69)                                                       1738   32.1%
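
Judging from the sample rows, many values here look like a broader group followed by " | " and then the value found in ocorrencia_tipo, while others have no separator and repeat ocorrencia_tipo unchanged. That reading is an inference from the samples, not something the report states. Under that assumption, a value could be split as sketched below; split_category is a hypothetical helper.

```python
from typing import Tuple


def split_category(raw: str) -> Tuple[str, str]:
    """Split 'GROUP | TYPE' into (group, type); values without ' | ' map to (value, value)."""
    head, sep, tail = raw.partition(" | ")
    if not sep:
        # No separator: treat the whole value as both group and type.
        return raw, raw
    return head, tail


pair = split_category("OUTROS | COM PESSOAL EM VOO")
single = split_category("PERDA DE CONTROLE NO SOLO")
print(pair, single)
```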

Length

Histogram of lengths of the category
ValueCountFrequency (%)
4904
 
11.4%
de3976
 
9.3%
falha3197
 
7.5%
ou2515
 
5.9%
mau2493
 
5.8%
funcionamento2493
 
5.8%
componente1791
 
4.2%
sistema1723
 
4.0%
do1472
 
3.4%
motor1472
 
3.4%
Other values (124)16850
39.3%

Most occurring characters

ValueCountFrequency (%)
37472
15.8%
O33341
14.1%
E20068
 
8.5%
A19009
 
8.0%
N15097
 
6.4%
M14283
 
6.0%
T12419
 
5.3%
U11559
 
4.9%
S9485
 
4.0%
C8858
 
3.7%
Other values (32)54937
23.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter193743
81.9%
Space Separator37472
 
15.8%
Math Symbol2860
 
1.2%
Other Punctuation2242
 
0.9%
Open Punctuation72
 
< 0.1%
Close Punctuation72
 
< 0.1%
Dash Punctuation67
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O33341
17.2%
E20068
10.4%
A19009
9.8%
N15097
 
7.8%
M14283
 
7.4%
T12419
 
6.4%
U11559
 
6.0%
S9485
 
4.9%
C8858
 
4.6%
D7410
 
3.8%
Other values (24)42214
21.8%
Other Punctuation
ValueCountFrequency (%)
/2117
94.4%
.117
 
5.2%
,8
 
0.4%
Space Separator
ValueCountFrequency (%)
37472
100.0%
Math Symbol
ValueCountFrequency (%)
|2860
100.0%
Open Punctuation
ValueCountFrequency (%)
(72
100.0%
Close Punctuation
ValueCountFrequency (%)
)72
100.0%
Dash Punctuation
ValueCountFrequency (%)
-67
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin193743
81.9%
Common42785
 
18.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
O33341
17.2%
E20068
10.4%
A19009
9.8%
N15097
 
7.8%
M14283
 
7.4%
T12419
 
6.4%
U11559
 
6.0%
S9485
 
4.9%
C8858
 
4.6%
D7410
 
3.8%
Other values (24)42214
21.8%
Common
ValueCountFrequency (%)
37472
87.6%
|2860
 
6.7%
/2117
 
4.9%
.117
 
0.3%
(72
 
0.2%
)72
 
0.2%
-67
 
0.2%
,8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII233775
98.8%
None2753
 
1.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
37472
16.0%
O33341
14.3%
E20068
 
8.6%
A19009
 
8.1%
N15097
 
6.5%
M14283
 
6.1%
T12419
 
5.3%
U11559
 
4.9%
S9485
 
4.1%
C8858
 
3.8%
Other values (22)52184
22.3%
None
ValueCountFrequency (%)
Ã1484
53.9%
Á380
 
13.8%
Ç357
 
13.0%
É166
 
6.0%
Í121
 
4.4%
Ó114
 
4.1%
Ô96
 
3.5%
Õ15
 
0.5%
Ê12
 
0.4%
Â8
 
0.3%

taxonomia_tipo_icao
Categorical

HIGH CORRELATION

Distinct: 32
Distinct (%): 0.6%
Missing: 1
Missing (%): < 0.1%
Memory size: 42.4 KiB

Value              Count
SCF-NP             1723
SCF-PP             770
OTHR               600
LOC-G              386
LOC-I              327
Other values (27)  1610

Length

Max length7
Median length6
Mean length4.879431315
Min length2

Characters and Unicode

Total characters26427
Distinct characters25
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowSCF-NP
2nd rowOTHR
3rd rowSCF-PP
4th rowSCF-PP
5th rowLOC-G

Common Values

Value              Count  Frequency (%)
SCF-NP             1723   31.8%
SCF-PP             770    14.2%
OTHR               600    11.1%
LOC-G              386    7.1%
LOC-I              327    6.0%
RE                 247    4.6%
BIRD               199    3.7%
ARC                191    3.5%
CTOL               173    3.2%
UNK                126    2.3%
Other values (22)  674    12.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
scf-np1723
31.8%
scf-pp770
14.2%
othr600
 
11.1%
loc-g386
 
7.1%
loc-i327
 
6.0%
re247
 
4.6%
bird199
 
3.7%
arc191
 
3.5%
ctol173
 
3.2%
unk126
 
2.3%
Other values (22)674
 
12.4%

Most occurring characters

ValueCountFrequency (%)
C3862
14.6%
P3276
12.4%
-3262
12.3%
F2680
10.1%
S2546
9.6%
N1926
7.3%
O1626
6.2%
L1321
 
5.0%
R1309
 
5.0%
T932
 
3.5%
Other values (15)3687
14.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter23131
87.5%
Dash Punctuation3262
 
12.3%
Open Punctuation13
 
< 0.1%
Close Punctuation13
 
< 0.1%
Other Punctuation8
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C3862
16.7%
P3276
14.2%
F2680
11.6%
S2546
11.0%
N1926
8.3%
O1626
7.0%
L1321
 
5.7%
R1309
 
5.7%
T932
 
4.0%
I684
 
3.0%
Other values (11)2969
12.8%
Dash Punctuation
ValueCountFrequency (%)
-3262
100.0%
Open Punctuation
ValueCountFrequency (%)
[13
100.0%
Close Punctuation
ValueCountFrequency (%)
]13
100.0%
Other Punctuation
ValueCountFrequency (%)
/8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin23131
87.5%
Common3296
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
C3862
16.7%
P3276
14.2%
F2680
11.6%
S2546
11.0%
N1926
8.3%
O1626
7.0%
L1321
 
5.7%
R1309
 
5.7%
T932
 
4.0%
I684
 
3.0%
Other values (11)2969
12.8%
Common
ValueCountFrequency (%)
-3262
99.0%
[13
 
0.4%
]13
 
0.4%
/8
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII26427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C3862
14.6%
P3276
12.4%
-3262
12.3%
F2680
10.1%
S2546
9.6%
N1926
7.3%
O1626
6.2%
L1321
 
5.0%
R1309
 
5.0%
T932
 
3.5%
Other values (15)3687
14.0%

aeronave_matricula
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 3901
Distinct (%): 72.2%
Missing: 14
Missing (%): 0.3%
Memory size: 42.4 KiB

Value                Count
PRTTW                10
PRTTP                9
PRTTK                9
PRATV                9
PRFLM                9
Other values (3896)  5357

Length

Max length8
Median length5
Mean length5.019433648
Min length5

Characters and Unicode

Total characters27120
Distinct characters36
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2968 ?
Unique (%)54.9%

Sample

1st rowPRCDL
2nd rowPRTKB
3rd rowPTGOO
4th rowPUUSS
5th rowPTUCL

Common Values

Value                Count  Frequency (%)
PRTTW                10     0.2%
PRTTP                9      0.2%
PRTTK                9      0.2%
PRATV                9      0.2%
PRFLM                9      0.2%
PRAYN                8      0.1%
PPPTQ                8      0.1%
PTLSJ                8      0.1%
PPFXH                8      0.1%
PREJI                7      0.1%
Other values (3891)  5318   98.2%
(Missing)            14     0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
prttw10
 
0.2%
prttk9
 
0.2%
pratv9
 
0.2%
prflm9
 
0.2%
prttp9
 
0.2%
prayn8
 
0.1%
ppptq8
 
0.1%
ptlsj8
 
0.1%
ppfxh8
 
0.1%
pppto7
 
0.1%
Other values (3891)5318
98.4%

Most occurring characters

ValueCountFrequency (%)
P6662
24.6%
R3014
 
11.1%
T2623
 
9.7%
A1218
 
4.5%
U887
 
3.3%
M836
 
3.1%
G827
 
3.0%
E746
 
2.8%
O722
 
2.7%
C705
 
2.6%
Other values (26)8880
32.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter26874
99.1%
Decimal Number246
 
0.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P6662
24.8%
R3014
 
11.2%
T2623
 
9.8%
A1218
 
4.5%
U887
 
3.3%
M836
 
3.1%
G827
 
3.1%
E746
 
2.8%
O722
 
2.7%
C705
 
2.6%
Other values (16)8634
32.1%
Decimal Number
ValueCountFrequency (%)
137
15.0%
232
13.0%
329
11.8%
525
10.2%
023
9.3%
622
8.9%
721
8.5%
921
8.5%
420
8.1%
816
6.5%

Most occurring scripts

ValueCountFrequency (%)
Latin26874
99.1%
Common246
 
0.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
P6662
24.8%
R3014
 
11.2%
T2623
 
9.8%
A1218
 
4.5%
U887
 
3.3%
M836
 
3.1%
G827
 
3.1%
E746
 
2.8%
O722
 
2.7%
C705
 
2.6%
Other values (16)8634
32.1%
Common
ValueCountFrequency (%)
137
15.0%
232
13.0%
329
11.8%
525
10.2%
023
9.3%
622
8.9%
721
8.5%
921
8.5%
420
8.1%
816
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII27120
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P6662
24.6%
R3014
 
11.1%
T2623
 
9.7%
A1218
 
4.5%
U887
 
3.3%
M836
 
3.1%
G827
 
3.0%
E746
 
2.8%
O722
 
2.7%
C705
 
2.6%
Other values (26)8880
32.7%

aeronave_operador_categoria
Categorical

HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): 0.4%
Missing: 3128
Missing (%): 57.7%
Memory size: 42.4 KiB

Value             Count
REGULAR           636
PARTICULAR        589
TÁXI AÉREO        325
INSTRUÇÃO         303
EXPERIMENTAL      283
Other values (5)  153

Length

Max length20
Median length13
Mean length9.678462211
Min length7

Characters and Unicode

Total characters22154
Distinct characters24
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPARTICULAR
2nd rowEXPERIMENTAL
3rd rowPARTICULAR
4th rowREGULAR
5th rowPARTICULAR

Common Values

Value                 Count  Frequency (%)
REGULAR               636    11.7%
PARTICULAR            589    10.9%
TÁXI AÉREO            325    6.0%
INSTRUÇÃO             303    5.6%
EXPERIMENTAL          283    5.2%
ADMINISTRAÇÃO DIRETA  89     1.6%
AGRÍCOLA              21     0.4%
ESPECIALIZADA         21     0.4%
NÃO REGULAR           14     0.3%
MÚLTIPLA              8      0.1%
(Missing)             3128   57.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
regular650
23.9%
particular589
21.7%
táxi325
12.0%
aéreo325
12.0%
instrução303
11.2%
experimental283
10.4%
administração89
 
3.3%
direta89
 
3.3%
agrícola21
 
0.8%
especializada21
 
0.8%
Other values (2)22
 
0.8%

Most occurring characters

ValueCountFrequency (%)
R3588
16.2%
A2816
12.7%
E1955
8.8%
I1817
 
8.2%
T1686
 
7.6%
L1580
 
7.1%
U1542
 
7.0%
P901
 
4.1%
O752
 
3.4%
N689
 
3.1%
Other values (14)4828
21.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter21726
98.1%
Space Separator428
 
1.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R3588
16.5%
A2816
13.0%
E1955
9.0%
I1817
8.4%
T1686
 
7.8%
L1580
 
7.3%
U1542
 
7.1%
P901
 
4.1%
O752
 
3.5%
N689
 
3.2%
Other values (13)4400
20.3%
Space Separator
ValueCountFrequency (%)
428
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin21726
98.1%
Common428
 
1.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
R3588
16.5%
A2816
13.0%
E1955
9.0%
I1817
8.4%
T1686
 
7.8%
L1580
 
7.3%
U1542
 
7.1%
P901
 
4.1%
O752
 
3.5%
N689
 
3.2%
Other values (13)4400
20.3%
Common
ValueCountFrequency (%)
428
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII20677
93.3%
None1477
 
6.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R3588
17.4%
A2816
13.6%
E1955
9.5%
I1817
8.8%
T1686
8.2%
L1580
7.6%
U1542
7.5%
P901
 
4.4%
O752
 
3.6%
N689
 
3.3%
Other values (8)3351
16.2%
None
ValueCountFrequency (%)
Ã406
27.5%
Ç392
26.5%
É325
22.0%
Á325
22.0%
Í21
 
1.4%
Ú8
 
0.5%

aeronave_tipo_veiculo
Categorical

HIGH CORRELATION
MISSING

Distinct: 9
Distinct (%): 0.2%
Missing: 159
Missing (%): 2.9%
Memory size: 42.4 KiB

Value             Count
AVIÃO             4346
HELICÓPTERO       556
ULTRALEVE         316
PLANADOR          18
ANFÍBIO           13
Other values (4)  9

Length

Max length11
Median length5
Mean length5.892544694
Min length5

Characters and Unicode

Total characters30983
Distinct characters21
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowAVIÃO
2nd rowAVIÃO
3rd rowAVIÃO
4th rowULTRALEVE
5th rowAVIÃO

Common Values

Value        Count  Frequency (%)
AVIÃO        4346   80.2%
HELICÓPTERO  556    10.3%
ULTRALEVE    316    5.8%
PLANADOR     18     0.3%
ANFÍBIO      13     0.2%
TRIKE        5      0.1%
DIRIGÍVEL    2      < 0.1%
BALÃO        1      < 0.1%
HIDROAVIÃO   1      < 0.1%
(Missing)    159    2.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
avião4346
82.7%
helicóptero556
 
10.6%
ultraleve316
 
6.0%
planador18
 
0.3%
anfíbio13
 
0.2%
trike5
 
0.1%
dirigível2
 
< 0.1%
balão1
 
< 0.1%
hidroavião1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
O4936
15.9%
I4926
15.9%
A4713
15.2%
V4665
15.1%
Ã4348
14.0%
E1751
 
5.7%
L1209
 
3.9%
R898
 
2.9%
T877
 
2.8%
P574
 
1.9%
Other values (11)2086
6.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter30983
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O4936
15.9%
I4926
15.9%
A4713
15.2%
V4665
15.1%
Ã4348
14.0%
E1751
 
5.7%
L1209
 
3.9%
R898
 
2.9%
T877
 
2.8%
P574
 
1.9%
Other values (11)2086
6.7%

Most occurring scripts

ValueCountFrequency (%)
Latin30983
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O4936
15.9%
I4926
15.9%
A4713
15.2%
V4665
15.1%
Ã4348
14.0%
E1751
 
5.7%
L1209
 
3.9%
R898
 
2.9%
T877
 
2.8%
P574
 
1.9%
Other values (11)2086
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII26064
84.1%
None4919
 
15.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O4936
18.9%
I4926
18.9%
A4713
18.1%
V4665
17.9%
E1751
 
6.7%
L1209
 
4.6%
R898
 
3.4%
T877
 
3.4%
P574
 
2.2%
H557
 
2.1%
Other values (8)958
 
3.7%
None
ValueCountFrequency (%)
Ã4348
88.4%
Ó556
 
11.3%
Í15
 
0.3%

aeronave_fabricante
Categorical

HIGH CARDINALITY
MISSING

Distinct: 233
Distinct (%): 4.6%
Missing: 358
Missing (%): 6.6%
Memory size: 42.4 KiB

Value                        Count
CESSNA AIRCRAFT              862
NEIVA INDUSTRIA AERONAUTICA  607
EMBRAER                      600
PIPER AIRCRAFT               401
BOEING COMPANY               268
Other values (228)           2321

Length

Max length47
Median length39
Mean length15.77406602
Min length2

Characters and Unicode

Total characters79801
Distinct characters44
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique135 ?
Unique (%)2.7%

Sample

1st rowRAYTHEON AIRCRAFT
2nd rowAEROSPATIALE AND ALENIA
3rd rowNEIVA INDUSTRIA AERONAUTICA
4th rowNEIVA INDUSTRIA AERONAUTICA
5th rowEUROCOPTER FRANCE

Common Values

Value                        Count  Frequency (%)
CESSNA AIRCRAFT              862    15.9%
NEIVA INDUSTRIA AERONAUTICA  607    11.2%
EMBRAER                      600    11.1%
PIPER AIRCRAFT               401    7.4%
BOEING COMPANY               268    4.9%
BEECH AIRCRAFT               266    4.9%
AIRBUS INDUSTRIE             253    4.7%
AEROSPATIALE AND ALENIA      249    4.6%
AERO BOERO                   160    3.0%
ROBINSON HELICOPTER          132    2.4%
Other values (223)           1261   23.3%
(Missing)                    358    6.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
aircraft1769
16.9%
cessna862
 
8.3%
aeronautica659
 
6.3%
industria646
 
6.2%
neiva607
 
5.8%
embraer600
 
5.7%
piper401
 
3.8%
company271
 
2.6%
boeing269
 
2.6%
beech267
 
2.6%
Other values (425)4091
39.2%

Most occurring characters

ValueCountFrequency (%)
A12075
15.1%
R9518
11.9%
I7712
9.7%
E7668
9.6%
5385
 
6.7%
N4976
 
6.2%
C4783
 
6.0%
T4712
 
5.9%
S4211
 
5.3%
O3417
 
4.3%
Other values (34)15344
19.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter74336
93.2%
Space Separator5385
 
6.7%
Other Punctuation57
 
0.1%
Dash Punctuation17
 
< 0.1%
Decimal Number5
 
< 0.1%
Lowercase Letter1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A12075
16.2%
R9518
12.8%
I7712
10.4%
E7668
10.3%
N4976
 
6.7%
C4783
 
6.4%
T4712
 
6.3%
S4211
 
5.7%
O3417
 
4.6%
U2319
 
3.1%
Other values (23)12945
17.4%
Other Punctuation
ValueCountFrequency (%)
.49
86.0%
&3
 
5.3%
'2
 
3.5%
\1
 
1.8%
1
 
1.8%
/1
 
1.8%
Decimal Number
ValueCountFrequency (%)
53
60.0%
72
40.0%
Space Separator
ValueCountFrequency (%)
5385
100.0%
Dash Punctuation
ValueCountFrequency (%)
-17
100.0%
Lowercase Letter
ValueCountFrequency (%)
t1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin74337
93.2%
Common5464
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A12075
16.2%
R9518
12.8%
I7712
10.4%
E7668
10.3%
N4976
 
6.7%
C4783
 
6.4%
T4712
 
6.3%
S4211
 
5.7%
O3417
 
4.6%
U2319
 
3.1%
Other values (24)12946
17.4%
Common
ValueCountFrequency (%)
5385
98.6%
.49
 
0.9%
-17
 
0.3%
&3
 
0.1%
53
 
0.1%
72
 
< 0.1%
'2
 
< 0.1%
\1
 
< 0.1%
1
 
< 0.1%
/1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII79767
> 99.9%
None33
 
< 0.1%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A12075
15.1%
R9518
11.9%
I7712
9.7%
E7668
9.6%
5385
 
6.8%
N4976
 
6.2%
C4783
 
6.0%
T4712
 
5.9%
S4211
 
5.3%
O3417
 
4.3%
Other values (26)15310
19.2%
None
ValueCountFrequency (%)
Á14
42.4%
Ú8
24.2%
Ã4
 
12.1%
Â3
 
9.1%
É2
 
6.1%
Ç1
 
3.0%
Ó1
 
3.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

aeronave_modelo
Categorical

HIGH CARDINALITY
MISSING

Distinct: 737
Distinct (%): 14.1%
Missing: 175
Missing (%): 3.2%
Memory size: 42.4 KiB

Value               Count
ATR-72-212A         209
ERJ 190-200 IGW     174
EMB-810D            149
152                 144
AB-115              140
Other values (732)  4426

Length

Max length29
Median length19
Mean length7.225295689
Min length2

Characters and Unicode

Total characters37875
Distinct characters62
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique353 ?
Unique (%)6.7%

Sample

1st row58
2nd rowATR-42-500
3rd rowEMB-201
4th rowP2004 BRAVO
5th rowEMB-201A

Common Values

Value               Count  Frequency (%)
ATR-72-212A         209    3.9%
ERJ 190-200 IGW     174    3.2%
EMB-810D            149    2.8%
152                 144    2.7%
AB-115              140    2.6%
EMB-810C            119    2.2%
EMB-202             95     1.8%
737-8EH             93     1.7%
EMB-201A            93     1.7%
EMB-202A            89     1.6%
Other values (727)  3937   72.7%
(Missing)           175    3.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
erj248
 
3.8%
igw213
 
3.3%
atr-72-212a209
 
3.2%
190-200175
 
2.7%
emb-810d149
 
2.3%
152144
 
2.2%
ab-115140
 
2.1%
as128
 
2.0%
emb-810c119
 
1.8%
350109
 
1.7%
Other values (814)4908
75.0%

Most occurring characters

ValueCountFrequency (%)
24025
 
10.6%
-3777
 
10.0%
03473
 
9.2%
13061
 
8.1%
A2627
 
6.9%
B1774
 
4.7%
E1758
 
4.6%
51461
 
3.9%
31373
 
3.6%
71339
 
3.5%
Other values (52)13207
34.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number17603
46.5%
Uppercase Letter15121
39.9%
Dash Punctuation3777
 
10.0%
Space Separator1300
 
3.4%
Lowercase Letter35
 
0.1%
Other Punctuation23
 
0.1%
Math Symbol8
 
< 0.1%
Close Punctuation4
 
< 0.1%
Open Punctuation4
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A2627
17.4%
B1774
11.7%
E1758
11.6%
R1214
 
8.0%
M1121
 
7.4%
T865
 
5.7%
C731
 
4.8%
P708
 
4.7%
S631
 
4.2%
I542
 
3.6%
Other values (18)3150
20.8%
Lowercase Letter
ValueCountFrequency (%)
r6
17.1%
a3
8.6%
v3
8.6%
o3
8.6%
e3
8.6%
u3
8.6%
s2
 
5.7%
i2
 
5.7%
b2
 
5.7%
k1
 
2.9%
Other values (7)7
20.0%
Decimal Number
Value | Count | Frequency (%)
2 | 4025 | 22.9%
0 | 3473 | 19.7%
1 | 3061 | 17.4%
5 | 1461 | 8.3%
3 | 1373 | 7.8%
7 | 1339 | 7.6%
8 | 1099 | 6.2%
4 | 754 | 4.3%
9 | 511 | 2.9%
6 | 507 | 2.9%
Other Punctuation
ValueCountFrequency (%)
/16
69.6%
.7
30.4%
Dash Punctuation
ValueCountFrequency (%)
-3777
100.0%
Space Separator
ValueCountFrequency (%)
1300
100.0%
Math Symbol
ValueCountFrequency (%)
+8
100.0%
Close Punctuation
ValueCountFrequency (%)
)4
100.0%
Open Punctuation
ValueCountFrequency (%)
(4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common22719
60.0%
Latin15156
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A2627
17.3%
B1774
11.7%
E1758
11.6%
R1214
 
8.0%
M1121
 
7.4%
T865
 
5.7%
C731
 
4.8%
P708
 
4.7%
S631
 
4.2%
I542
 
3.6%
Other values (35)3185
21.0%
Common
ValueCountFrequency (%)
24025
17.7%
-3777
16.6%
03473
15.3%
13061
13.5%
51461
 
6.4%
31373
 
6.0%
71339
 
5.9%
1300
 
5.7%
81099
 
4.8%
4754
 
3.3%
Other values (7)1057
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII37873
> 99.9%
None2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
24025
 
10.6%
-3777
 
10.0%
03473
 
9.2%
13061
 
8.1%
A2627
 
6.9%
B1774
 
4.7%
E1758
 
4.6%
51461
 
3.9%
31373
 
3.6%
71339
 
3.5%
Other values (50)13205
34.9%
None
ValueCountFrequency (%)
Á1
50.0%
É1
50.0%

aeronave_tipo_icao
Categorical

HIGH CARDINALITY
MISSING

Distinct229
Distinct (%)4.5%
Missing271
Missing (%)5.0%
Memory size42.4 KiB
PA34
389 
IPAN
 
312
ULAC
 
293
E190
 
229
AT72
 
204
Other values (224)
3719 

Length

Max length5
Median length4
Mean length3.938981733
Min length2

Characters and Unicode

Total characters20270
Distinct characters36
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique54 ?
Unique (%)1.0%
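The summary percentages for aeronave_tipo_icao can be reproduced from the raw counts. The sketch below is a consistency check under one assumption inferred from the numbers rather than stated in the report: Missing (%) is taken over all 5417 observations, while Distinct (%), Unique (%) and the mean length appear to be taken over the non-missing rows only. The helper name `pct` is illustrative.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place, as displayed in the report."""
    return round(100 * part / whole, 1)


n_rows = 5417        # number of observations (dataset overview)
n_missing = 271      # missing values in aeronave_tipo_icao
n_distinct = 229
n_unique = 54
total_chars = 20270

n_present = n_rows - n_missing  # rows that actually carry a value

missing_pct = pct(n_missing, n_rows)        # over all rows
distinct_pct = pct(n_distinct, n_present)   # over non-missing rows (assumed)
unique_pct = pct(n_unique, n_present)       # over non-missing rows (assumed)
mean_length = total_chars / n_present

print(missing_pct, distinct_pct, unique_pct, round(mean_length, 6))
```

Dividing the distinct count by all 5417 rows instead would give 4.2%, which does not match the reported 4.5%.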

Sample

1st rowBE58
2nd rowAT45
3rd rowIPAN
4th rowULAC
5th rowIPAN

Common Values

Value | Count | Frequency (%)
PA34 | 389 | 7.2%
IPAN | 312 | 5.8%
ULAC | 293 | 5.4%
E190 | 229 | 4.2%
AT72 | 204 | 3.8%
A320 | 150 | 2.8%
AB11 | 140 | 2.6%
C152 | 140 | 2.6%
ZZZZ | 122 | 2.3%
AS50 | 120 | 2.2%
Other values (219) | 3047 | 56.2%
(Missing) | 271 | 5.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
pa34 | 389 | 7.6%
ipan | 312 | 6.1%
ulac | 293 | 5.7%
e190 | 229 | 4.4%
at72 | 204 | 4.0%
a320 | 150 | 2.9%
ab11 | 140 | 2.7%
c152 | 140 | 2.7%
zzzz | 122 | 2.4%
as50 | 120 | 2.3%
Other values (220) | 3048 | 59.2%

Most occurring characters

Value | Count | Frequency (%)
A | 2442 | 12.0%
2 | 1951 | 9.6%
1 | 1484 | 7.3%
P | 1417 | 7.0%
3 | 1283 | 6.3%
C | 1230 | 6.1%
0 | 1221 | 6.0%
5 | 990 | 4.9%
8 | 866 | 4.3%
B | 862 | 4.3%
Other values (26) | 6524 | 32.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter10243
50.5%
Decimal Number10026
49.5%
Space Separator1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A2442
23.8%
P1417
13.8%
C1230
12.0%
B862
 
8.4%
E719
 
7.0%
T508
 
5.0%
L507
 
4.9%
Z489
 
4.8%
U382
 
3.7%
S358
 
3.5%
Other values (15)1329
13.0%
Decimal Number
Value | Count | Frequency (%)
2 | 1951 | 19.5%
1 | 1484 | 14.8%
3 | 1283 | 12.8%
0 | 1221 | 12.2%
5 | 990 | 9.9%
8 | 866 | 8.6%
7 | 745 | 7.4%
4 | 727 | 7.3%
9 | 429 | 4.3%
6 | 330 | 3.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin10243
50.5%
Common10027
49.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A2442
23.8%
P1417
13.8%
C1230
12.0%
B862
 
8.4%
E719
 
7.0%
T508
 
5.0%
L507
 
4.9%
Z489
 
4.8%
U382
 
3.7%
S358
 
3.5%
Other values (15)1329
13.0%
Common
ValueCountFrequency (%)
21951
19.5%
11484
14.8%
31283
12.8%
01221
12.2%
5990
9.9%
8866
8.6%
7745
 
7.4%
4727
 
7.3%
9429
 
4.3%
6330
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII20270
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A2442
 
12.0%
21951
 
9.6%
11484
 
7.3%
P1417
 
7.0%
31283
 
6.3%
C1230
 
6.1%
01221
 
6.0%
5990
 
4.9%
8866
 
4.3%
B862
 
4.3%
Other values (26)6524
32.2%

aeronave_motor_tipo
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct5
Distinct (%)0.1%
Missing237
Missing (%)4.4%
Memory size42.4 KiB
PISTÃO
3013 
JATO
978 
TURBOÉLICE
695 
TURBOEIXO
476 
SEM TRAÇÃO
 
18

Length

Max length10
Median length6
Mean length6.448648649
Min length4

Characters and Unicode

Total characters33404
Distinct characters19
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPISTÃO
2nd rowTURBOÉLICE
3rd rowPISTÃO
4th rowPISTÃO
5th rowPISTÃO

Common Values

Value | Count | Frequency (%)
PISTÃO | 3013 | 55.6%
JATO | 978 | 18.1%
TURBOÉLICE | 695 | 12.8%
TURBOEIXO | 476 | 8.8%
SEM TRAÇÃO | 18 | 0.3%
(Missing) | 237 | 4.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
pistão3013
58.0%
jato978
 
18.8%
turboélice695
 
13.4%
turboeixo476
 
9.2%
sem18
 
0.3%
tração18
 
0.3%
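The word-level table for aeronave_motor_tipo is consistent with lower-casing each value, splitting it on whitespace, and expressing each word's count as a share of all words rather than of all rows. The sketch below reproduces those figures from the value counts reported for this variable; the function name `word_frequencies` is an illustrative assumption, and the profiler's actual tokenizer may differ.

```python
from collections import Counter


def word_frequencies(value_counts):
    """Split each value into lower-cased words and weight them by row count."""
    words = Counter()
    for value, count in value_counts.items():
        for token in value.lower().split():
            words[token] += count
    total = sum(words.values())
    return {
        word: (count, round(100 * count / total, 1))
        for word, count in words.most_common()
    }


# Value counts as reported for aeronave_motor_tipo (missing rows excluded).
motor_tipo = {
    "PISTÃO": 3013,
    "JATO": 978,
    "TURBOÉLICE": 695,
    "TURBOEIXO": 476,
    "SEM TRAÇÃO": 18,
}

freqs = word_frequencies(motor_tipo)
for word, (count, share) in freqs.items():
    print(f"{word} | {count} | {share}%")
```

Because "SEM TRAÇÃO" contributes two words, the denominator is 5198 words rather than 5180 rows, which is why "pistão" shows 58.0% here against 55.6% in the value-level table.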

Most occurring characters

ValueCountFrequency (%)
O5656
16.9%
T5180
15.5%
I4184
12.5%
S3031
9.1%
Ã3031
9.1%
P3013
9.0%
E1189
 
3.6%
R1189
 
3.6%
U1171
 
3.5%
B1171
 
3.5%
Other values (9)4589
13.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter33386
99.9%
Space Separator18
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O5656
16.9%
T5180
15.5%
I4184
12.5%
S3031
9.1%
Ã3031
9.1%
P3013
9.0%
E1189
 
3.6%
R1189
 
3.6%
U1171
 
3.5%
B1171
 
3.5%
Other values (8)4571
13.7%
Space Separator
ValueCountFrequency (%)
18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin33386
99.9%
Common18
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
O5656
16.9%
T5180
15.5%
I4184
12.5%
S3031
9.1%
Ã3031
9.1%
P3013
9.0%
E1189
 
3.6%
R1189
 
3.6%
U1171
 
3.5%
B1171
 
3.5%
Other values (8)4571
13.7%
Common
ValueCountFrequency (%)
18
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII29660
88.8%
None3744
 
11.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O5656
19.1%
T5180
17.5%
I4184
14.1%
S3031
10.2%
P3013
10.2%
E1189
 
4.0%
R1189
 
4.0%
U1171
 
3.9%
B1171
 
3.9%
A996
 
3.4%
Other values (6)2880
9.7%
None
ValueCountFrequency (%)
Ã3031
81.0%
É695
 
18.6%
Ç18
 
0.5%

aeronave_motor_quantidade
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct5
Distinct (%)0.1%
Missing94
Missing (%)1.7%
Memory size42.4 KiB
MONOMOTOR
2802 
BIMOTOR
2277 
SEM TRAÇÃO
 
171
TRIMOTOR
 
68
QUADRIMOTOR
 
5

Length

Max length11
Median length9
Mean length8.165696036
Min length7

Characters and Unicode

Total characters43466
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBIMOTOR
2nd rowBIMOTOR
3rd rowMONOMOTOR
4th rowMONOMOTOR
5th rowMONOMOTOR

Common Values

Value | Count | Frequency (%)
MONOMOTOR | 2802 | 51.7%
BIMOTOR | 2277 | 42.0%
SEM TRAÇÃO | 171 | 3.2%
TRIMOTOR | 68 | 1.3%
QUADRIMOTOR | 5 | 0.1%
(Missing) | 94 | 1.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
monomotor2802
51.0%
bimotor2277
41.4%
sem171
 
3.1%
tração171
 
3.1%
trimotor68
 
1.2%
quadrimotor5
 
0.1%

Most occurring characters

ValueCountFrequency (%)
O16079
37.0%
M8125
18.7%
R5396
 
12.4%
T5391
 
12.4%
N2802
 
6.4%
I2350
 
5.4%
B2277
 
5.2%
A176
 
0.4%
S171
 
0.4%
E171
 
0.4%
Other values (6)528
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter43295
99.6%
Space Separator171
 
0.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O16079
37.1%
M8125
18.8%
R5396
 
12.5%
T5391
 
12.5%
N2802
 
6.5%
I2350
 
5.4%
B2277
 
5.3%
A176
 
0.4%
S171
 
0.4%
E171
 
0.4%
Other values (5)357
 
0.8%
Space Separator
ValueCountFrequency (%)
171
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin43295
99.6%
Common171
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
O16079
37.1%
M8125
18.8%
R5396
 
12.5%
T5391
 
12.5%
N2802
 
6.5%
I2350
 
5.4%
B2277
 
5.3%
A176
 
0.4%
S171
 
0.4%
E171
 
0.4%
Other values (5)357
 
0.8%
Common
ValueCountFrequency (%)
171
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII43124
99.2%
None342
 
0.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O16079
37.3%
M8125
18.8%
R5396
 
12.5%
T5391
 
12.5%
N2802
 
6.5%
I2350
 
5.4%
B2277
 
5.3%
A176
 
0.4%
S171
 
0.4%
E171
 
0.4%
Other values (4)186
 
0.4%
None
ValueCountFrequency (%)
Ç171
50.0%
Ã171
50.0%

aeronave_pmd
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct459
Distinct (%)8.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean14432.68599
Minimum0
Maximum396895
Zeros241
Zeros (%)4.4%
Negative0
Negative (%)0.0%
Memory size42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 408
Q1: 1157
Median: 1968
Q3: 5307
95-th percentile: 77000
Maximum: 396895
Range: 396895
Interquartile range (IQR): 4150
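The range and interquartile range for aeronave_pmd follow directly from the quantiles listed here. The sketch below recomputes them and, as an addition the report itself does not make, derives the conventional 1.5 x IQR outlier fences; the function name `spread` and the multiplier `k` are illustrative assumptions.

```python
def spread(minimum, q1, q3, maximum, k=1.5):
    """Range, IQR and k*IQR fences from four quantile values."""
    iqr = q3 - q1
    return {
        "range": maximum - minimum,
        "iqr": iqr,
        "lower_fence": q1 - k * iqr,
        "upper_fence": q3 + k * iqr,
    }


# Quantiles reported for aeronave_pmd.
pmd = spread(minimum=0, q1=1157, q3=5307, maximum=396895)
print(pmd)
```

Under that rule of thumb the upper fence sits well below the reported 95-th percentile of 77000, which fits the strong right skew reported for this variable.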

Descriptive statistics

Standard deviation: 33659.10235
Coefficient of variation (CV): 2.332144022
Kurtosis: 34.26910004
Mean: 14432.68599
Median Absolute Deviation (MAD): 1211
Skewness: 4.841505879
Sum: 78181860
Variance: 1132935171
Monotonicity: Not monotonic
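Several of the descriptive statistics for aeronave_pmd are derived from one another, which makes them easy to cross-check. The sketch below assumes only the standard definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared) and uses the figures reported for this variable; the variable names are illustrative.

```python
import math

n = 5417                 # observations; aeronave_pmd has no missing values
total = 78181860         # reported Sum
std = 33659.10235        # reported Standard deviation

mean = total / n
cv = std / mean          # coefficient of variation
variance = std ** 2

print(round(mean, 5), round(cv, 9), round(variance))
```

The recomputed values agree with the reported mean, CV and variance to within rounding of the displayed digits.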
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1800 | 300 | 5.5%
0 | 241 | 4.4%
2155 | 212 | 3.9%
52290 | 175 | 3.2%
1633 | 165 | 3.0%
2073 | 140 | 2.6%
770 | 140 | 2.6%
23000 | 133 | 2.5%
757 | 117 | 2.2%
1542 | 110 | 2.0%
Other values (449) | 3684 | 68.0%

Smallest values
Value | Count | Frequency (%)
0 | 241 | 4.4%
35 | 1 | < 0.1%
208 | 1 | < 0.1%
250 | 3 | 0.1%
280 | 2 | < 0.1%
300 | 1 | < 0.1%
308 | 1 | < 0.1%
340 | 1 | < 0.1%
342 | 1 | < 0.1%
352 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
396895 | 1 | < 0.1%
385553 | 1 | < 0.1%
380790 | 1 | < 0.1%
351534 | 1 | < 0.1%
346544 | 7 | 0.1%
285990 | 4 | 0.1%
263080 | 1 | < 0.1%
256280 | 1 | < 0.1%
253000 | 1 | < 0.1%
247200 | 1 | < 0.1%

aeronave_pmd_categoria
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct459
Distinct (%)8.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean14432.68599
Minimum0
Maximum396895
Zeros241
Zeros (%)4.4%
Negative0
Negative (%)0.0%
Memory size42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 408
Q1: 1157
Median: 1968
Q3: 5307
95-th percentile: 77000
Maximum: 396895
Range: 396895
Interquartile range (IQR): 4150

Descriptive statistics

Standard deviation: 33659.10235
Coefficient of variation (CV): 2.332144022
Kurtosis: 34.26910004
Mean: 14432.68599
Median Absolute Deviation (MAD): 1211
Skewness: 4.841505879
Sum: 78181860
Variance: 1132935171
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1800 | 300 | 5.5%
0 | 241 | 4.4%
2155 | 212 | 3.9%
52290 | 175 | 3.2%
1633 | 165 | 3.0%
2073 | 140 | 2.6%
770 | 140 | 2.6%
23000 | 133 | 2.5%
757 | 117 | 2.2%
1542 | 110 | 2.0%
Other values (449) | 3684 | 68.0%

Smallest values
Value | Count | Frequency (%)
0 | 241 | 4.4%
35 | 1 | < 0.1%
208 | 1 | < 0.1%
250 | 3 | 0.1%
280 | 2 | < 0.1%
300 | 1 | < 0.1%
308 | 1 | < 0.1%
340 | 1 | < 0.1%
342 | 1 | < 0.1%
352 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
396895 | 1 | < 0.1%
385553 | 1 | < 0.1%
380790 | 1 | < 0.1%
351534 | 1 | < 0.1%
346544 | 7 | 0.1%
285990 | 4 | 0.1%
263080 | 1 | < 0.1%
256280 | 1 | < 0.1%
253000 | 1 | < 0.1%
247200 | 1 | < 0.1%
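Every statistic reported for aeronave_pmd_categoria matches the one reported for aeronave_pmd, from the mean and quantiles down to the value counts. Identical summaries are consistent with the two columns holding the same values, although they do not prove it. A direct element-wise comparison would settle the question; the sketch below shows one way to do that on plain Python sequences, with the helper name `same_values` and the sample data being illustrative assumptions.

```python
import math


def same_values(left, right):
    """True if both sequences match element by element (NaN equals NaN)."""
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        both_nan = (
            isinstance(a, float) and isinstance(b, float)
            and math.isnan(a) and math.isnan(b)
        )
        if not (both_nan or a == b):
            return False
    return True


# Hypothetical excerpts standing in for the two columns.
pmd = [1800, 0, 2155, float("nan"), 52290]
pmd_categoria = [1800, 0, 2155, float("nan"), 52290]

print(same_values(pmd, pmd_categoria))
```

If the real columns do compare equal, one of them is redundant for modelling purposes, which would also explain the high-correlation alerts raised for both.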

aeronave_assentos
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct86
Distinct (%)1.6%
Missing199
Missing (%)3.7%
Infinite0
Infinite (%)0.0%
Mean26.72729015
Minimum0
Maximum384
Zeros258
Zeros (%)4.8%
Negative0
Negative (%)0.0%
Memory size42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 6
Q3: 10
95-th percentile: 172
Maximum: 384
Range: 384
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 53.80916901
Coefficient of variation (CV): 2.013266916
Kurtosis: 6.249536922
Mean: 26.72729015
Median Absolute Deviation (MAD): 4
Skewness: 2.530606815
Sum: 139463
Variance: 2895.426669
Monotonicity: Not monotonic
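Unlike aeronave_pmd, aeronave_assentos has missing values, and the reported mean only matches the reported sum when the 199 missing rows are left out of the denominator. The sketch below checks that arithmetic and shows a small helper that averages a column while skipping missing entries; the name `mean_ignoring_missing` and the use of `None` for missing values are illustrative assumptions.

```python
import math


def mean_ignoring_missing(values):
    """Arithmetic mean over the entries that are not None."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


n_rows = 5417          # number of observations (dataset overview)
n_missing = 199        # missing values in aeronave_assentos
total = 139463         # reported Sum

mean_present = total / (n_rows - n_missing)   # denominator: 5218 rows
mean_all_rows = total / n_rows                # what a naive count would give

print(round(mean_present, 8), round(mean_all_rows, 2))
print(mean_ignoring_missing([2, None, 6, 10]))
```

Only the first figure agrees with the reported mean of 26.72729015.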
Histogram with fixed size bins (bins=50)
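Common values
Value | Count | Frequency (%)
6 | 900 | 16.6%
2 | 806 | 14.9%
4 | 626 | 11.6%
1 | 579 | 10.7%
7 | 314 | 5.8%
0 | 258 | 4.8%
125 | 171 | 3.2%
8 | 165 | 3.0%
10 | 139 | 2.6%
11 | 129 | 2.4%
Other values (76) | 1131 | 20.9%
(Missing) | 199 | 3.7%

Smallest values
Value | Count | Frequency (%)
0 | 258 | 4.8%
1 | 579 | 10.7%
2 | 806 | 14.9%
3 | 70 | 1.3%
4 | 626 | 11.6%
5 | 114 | 2.1%
6 | 900 | 16.6%
7 | 314 | 5.8%
8 | 165 | 3.0%
9 | 50 | 0.9%

Largest values
Value | Count | Frequency (%)
384 | 5 | 0.1%
382 | 2 | < 0.1%
312 | 1 | < 0.1%
288 | 2 | < 0.1%
284 | 2 | < 0.1%
278 | 2 | < 0.1%
258 | 1 | < 0.1%
243 | 1 | < 0.1%
242 | 1 | < 0.1%
240 | 4 | 0.1%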

aeronave_ano_fabricacao
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct78
Distinct (%)1.6%
Missing520
Missing (%)9.6%
Infinite0
Infinite (%)0.0%
Mean1993.702062
Minimum1936
Maximum2020
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size42.4 KiB

Quantile statistics

Minimum: 1936
5-th percentile: 1969
Q1: 1980
Median: 1996
Q3: 2009
95-th percentile: 2013
Maximum: 2020
Range: 84
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 15.8722452
Coefficient of variation (CV): 0.007961192146
Kurtosis: -0.7836127684
Mean: 1993.702062
Median Absolute Deviation (MAD): 14
Skewness: -0.4209224121
Sum: 9763159
Variance: 251.9281677
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2012 | 257 | 4.7%
2010 | 235 | 4.3%
2011 | 231 | 4.3%
2009 | 201 | 3.7%
2008 | 189 | 3.5%
1980 | 180 | 3.3%
2007 | 176 | 3.2%
2013 | 171 | 3.2%
1979 | 141 | 2.6%
1977 | 134 | 2.5%
Other values (68) | 2982 | 55.0%
(Missing) | 520 | 9.6%

Smallest values
Value | Count | Frequency (%)
1936 | 1 | < 0.1%
1940 | 1 | < 0.1%
1945 | 2 | < 0.1%
1946 | 10 | 0.2%
1947 | 7 | 0.1%
1948 | 5 | 0.1%
1949 | 1 | < 0.1%
1950 | 15 | 0.3%
1951 | 5 | 0.1%
1952 | 3 | 0.1%

Largest values
Value | Count | Frequency (%)
2020 | 3 | 0.1%
2019 | 12 | 0.2%
2018 | 18 | 0.3%
2017 | 21 | 0.4%
2016 | 34 | 0.6%
2015 | 44 | 0.8%
2014 | 35 | 0.6%
2013 | 171 | 3.2%
2012 | 257 | 4.7%
2011 | 231 | 4.3%

aeronave_registro_categoria
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct9
Distinct (%)0.2%
Missing159
Missing (%)2.9%
Memory size42.4 KiB
AVIÃO
4346 
HELICÓPTERO
556 
ULTRALEVE
 
316
PLANADOR
 
18
ANFÍBIO
 
13
Other values (4)
 
9

Length

Max length11
Median length5
Mean length5.892544694
Min length5

Characters and Unicode

Total characters30983
Distinct characters21
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowAVIÃO
2nd rowAVIÃO
3rd rowAVIÃO
4th rowULTRALEVE
5th rowAVIÃO

Common Values

Value | Count | Frequency (%)
AVIÃO | 4346 | 80.2%
HELICÓPTERO | 556 | 10.3%
ULTRALEVE | 316 | 5.8%
PLANADOR | 18 | 0.3%
ANFÍBIO | 13 | 0.2%
TRIKE | 5 | 0.1%
DIRIGÍVEL | 2 | < 0.1%
BALÃO | 1 | < 0.1%
HIDROAVIÃO | 1 | < 0.1%
(Missing) | 159 | 2.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
avião4346
82.7%
helicóptero556
 
10.6%
ultraleve316
 
6.0%
planador18
 
0.3%
anfíbio13
 
0.2%
trike5
 
0.1%
dirigível2
 
< 0.1%
balão1
 
< 0.1%
hidroavião1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
O4936
15.9%
I4926
15.9%
A4713
15.2%
V4665
15.1%
Ã4348
14.0%
E1751
 
5.7%
L1209
 
3.9%
R898
 
2.9%
T877
 
2.8%
P574
 
1.9%
Other values (11)2086
6.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter30983
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O4936
15.9%
I4926
15.9%
A4713
15.2%
V4665
15.1%
Ã4348
14.0%
E1751
 
5.7%
L1209
 
3.9%
R898
 
2.9%
T877
 
2.8%
P574
 
1.9%
Other values (11)2086
6.7%

Most occurring scripts

ValueCountFrequency (%)
Latin30983
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O4936
15.9%
I4926
15.9%
A4713
15.2%
V4665
15.1%
Ã4348
14.0%
E1751
 
5.7%
L1209
 
3.9%
R898
 
2.9%
T877
 
2.8%
P574
 
1.9%
Other values (11)2086
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII26064
84.1%
None4919
 
15.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O4936
18.9%
I4926
18.9%
A4713
18.1%
V4665
17.9%
E1751
 
6.7%
L1209
 
4.6%
R898
 
3.4%
T877
 
3.4%
P574
 
2.2%
H557
 
2.1%
Other values (8)958
 
3.7%
None
ValueCountFrequency (%)
Ã4348
88.4%
Ó556
 
11.3%
Í15
 
0.3%

aeronave_registro_segmento
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct12
Distinct (%)0.2%
Missing73
Missing (%)1.3%
Memory size42.4 KiB
PARTICULAR
1729 
REGULAR
1063 
INSTRUÇÃO
736 
TÁXI AÉREO
643 
EXPERIMENTAL
444 
Other values (7)
729 

Length

Max length22
Median length20
Mean length9.706212575
Min length7

Characters and Unicode

Total characters51870
Distinct characters26
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowPARTICULAR
2nd rowREGULAR
3rd rowESPECIALIZADA
4th rowEXPERIMENTAL
5th rowESPECIALIZADA

Common Values

Value | Count | Frequency (%)
PARTICULAR | 1729 | 31.9%
REGULAR | 1063 | 19.6%
INSTRUÇÃO | 736 | 13.6%
TÁXI AÉREO | 643 | 11.9%
EXPERIMENTAL | 444 | 8.2%
AGRÍCOLA | 371 | 6.8%
ADMINISTRAÇÃO DIRETA | 188 | 3.5%
ESPECIALIZADA | 101 | 1.9%
MÚLTIPLA | 32 | 0.6%
NÃO REGULAR | 31 | 0.6%
Other values (2) | 6 | 0.1%
(Missing) | 73 | 1.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
particular1729
27.8%
regular1094
17.6%
instrução736
11.8%
táxi643
 
10.4%
aéreo643
 
10.4%
experimental444
 
7.1%
agrícola371
 
6.0%
administração193
 
3.1%
direta188
 
3.0%
especializada101
 
1.6%
Other values (4)69
 
1.1%

Most occurring characters

ValueCountFrequency (%)
R8227
15.9%
A7296
14.1%
I4372
8.4%
T3971
 
7.7%
L3803
 
7.3%
U3559
 
6.9%
E3464
 
6.7%
P2306
 
4.4%
C2202
 
4.2%
O1974
 
3.8%
Other values (16)10696
20.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter51003
98.3%
Space Separator867
 
1.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R8227
16.1%
A7296
14.3%
I4372
8.6%
T3971
7.8%
L3803
 
7.5%
U3559
 
7.0%
E3464
 
6.8%
P2306
 
4.5%
C2202
 
4.3%
O1974
 
3.9%
Other values (15)9829
19.3%
Space Separator
ValueCountFrequency (%)
867
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin51003
98.3%
Common867
 
1.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R8227
16.1%
A7296
14.3%
I4372
8.6%
T3971
7.8%
L3803
 
7.5%
U3559
 
7.0%
E3464
 
6.8%
P2306
 
4.5%
C2202
 
4.3%
O1974
 
3.9%
Other values (15)9829
19.3%
Common
ValueCountFrequency (%)
867
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII48291
93.1%
None3579
 
6.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R8227
17.0%
A7296
15.1%
I4372
9.1%
T3971
8.2%
L3803
7.9%
U3559
7.4%
E3464
7.2%
P2306
 
4.8%
C2202
 
4.6%
O1974
 
4.1%
Other values (9)7117
14.7%
None
ValueCountFrequency (%)
Ã960
26.8%
Ç929
26.0%
Á643
18.0%
É643
18.0%
Í371
 
10.4%
Ú32
 
0.9%
Ó1
 
< 0.1%

aeronave_voo_origem
Categorical

HIGH CARDINALITY
MISSING

Distinct677
Distinct (%)13.0%
Missing204
Missing (%)3.8%
Memory size42.4 KiB
FORA DE AERODROMO
1915 
CAMPO DE MARTE
 
104
VIRACOPOS
 
102
GOVERNADOR ANDRÉ FRANCO MONTORO
 
101
CONGONHAS
 
73
Other values (672)
2918 

Length

Max length63
Median length50
Mean length17.83982352
Min length3

Characters and Unicode

Total characters92999
Distinct characters70
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique327 ?
Unique (%)6.3%

Sample

1st rowFORA DE AERODROMO
2nd rowFORA DE AERODROMO
3rd rowFORA DE AERODROMO
4th rowFORA DE AERODROMO
5th rowFORA DE AERODROMO

Common Values

Value | Count | Frequency (%)
FORA DE AERODROMO | 1915 | 35.4%
CAMPO DE MARTE | 104 | 1.9%
VIRACOPOS | 102 | 1.9%
GOVERNADOR ANDRÉ FRANCO MONTORO | 101 | 1.9%
CONGONHAS | 73 | 1.3%
CARLOS DRUMMOND DE ANDRADE / PAMPULHA | 70 | 1.3%
PRESIDENTE JUSCELINO KUBITSCHEK | 65 | 1.2%
AEROPORTO ESTADUAL DE JUNDIAÍ | 64 | 1.2%
SANTA GENOVEVA/GOIÂNIA | 58 | 1.1%
BACACHERI | 55 | 1.0%
Other values (667) | 2606 | 48.1%
(Missing) | 204 | 3.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
de2563
 
17.6%
aerodromo1917
 
13.2%
fora1915
 
13.2%
235
 
1.6%
governador176
 
1.2%
carlos144
 
1.0%
campo137
 
0.9%
fazenda134
 
0.9%
estadual126
 
0.9%
santa113
 
0.8%
Other values (959)7093
48.7%

Most occurring characters

ValueCountFrequency (%)
O13568
14.6%
A11132
12.0%
R10719
11.5%
9358
10.1%
E8920
9.6%
D6779
 
7.3%
M3610
 
3.9%
N3271
 
3.5%
I3216
 
3.5%
S2983
 
3.2%
Other values (60)19443
20.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter83044
89.3%
Space Separator9358
 
10.1%
Other Punctuation259
 
0.3%
Lowercase Letter168
 
0.2%
Dash Punctuation119
 
0.1%
Decimal Number39
 
< 0.1%
Open Punctuation6
 
< 0.1%
Close Punctuation6
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O13568
16.3%
A11132
13.4%
R10719
12.9%
E8920
10.7%
D6779
8.2%
M3610
 
4.3%
N3271
 
3.9%
I3216
 
3.9%
S2983
 
3.6%
F2522
 
3.0%
Other values (27)16324
19.7%
Lowercase Letter
ValueCountFrequency (%)
o24
14.3%
r19
11.3%
a17
10.1%
e15
8.9%
c14
8.3%
n13
7.7%
s13
7.7%
t12
7.1%
i12
7.1%
l8
 
4.8%
Other values (9)21
12.5%
Decimal Number
Value | Count | Frequency (%)
1 | 17 | 43.6%
4 | 15 | 38.5%
2 | 3 | 7.7%
5 | 2 | 5.1%
3 | 2 | 5.1%
Other Punctuation
ValueCountFrequency (%)
/238
91.9%
.14
 
5.4%
*6
 
2.3%
'1
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
-117
98.3%
2
 
1.7%
Space Separator
ValueCountFrequency (%)
9358
100.0%
Open Punctuation
ValueCountFrequency (%)
(6
100.0%
Close Punctuation
ValueCountFrequency (%)
)6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin83212
89.5%
Common9787
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
O13568
16.3%
A11132
13.4%
R10719
12.9%
E8920
10.7%
D6779
8.1%
M3610
 
4.3%
N3271
 
3.9%
I3216
 
3.9%
S2983
 
3.6%
F2522
 
3.0%
Other values (46)16492
19.8%
Common
ValueCountFrequency (%)
9358
95.6%
/238
 
2.4%
-117
 
1.2%
117
 
0.2%
415
 
0.2%
.14
 
0.1%
(6
 
0.1%
*6
 
0.1%
)6
 
0.1%
23
 
< 0.1%
Other values (4)7
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII91479
98.4%
None1518
 
1.6%
Punctuation2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O13568
14.8%
A11132
12.2%
R10719
11.7%
9358
10.2%
E8920
9.8%
D6779
 
7.4%
M3610
 
3.9%
N3271
 
3.6%
I3216
 
3.5%
S2983
 
3.3%
Other values (46)17923
19.6%
None
ValueCountFrequency (%)
Ã320
21.1%
É288
19.0%
Í262
17.3%
Á232
15.3%
Â111
 
7.3%
Ó99
 
6.5%
Ç89
 
5.9%
Ú75
 
4.9%
Ô20
 
1.3%
Ê12
 
0.8%
Other values (3)10
 
0.7%
Punctuation
ValueCountFrequency (%)
2
100.0%

aeronave_voo_destino
Categorical

HIGH CARDINALITY
MISSING

Distinct674
Distinct (%)12.9%
Missing196
Missing (%)3.6%
Memory size42.4 KiB
FORA DE AERODROMO
1990 
CAMPO DE MARTE
 
113
GOVERNADOR ANDRÉ FRANCO MONTORO
 
99
CONGONHAS
 
70
VIRACOPOS
 
65
Other values (669)
2884 

Length

Max length50
Median length46
Mean length17.71308179
Min length3

Characters and Unicode

Total characters92480
Distinct characters74
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique341 ?
Unique (%)6.5%

Sample

1st rowFORA DE AERODROMO
2nd rowFORA DE AERODROMO
3rd rowFORA DE AERODROMO
4th rowFORA DE AERODROMO
5th rowFORA DE AERODROMO

Common Values

Value | Count | Frequency (%)
FORA DE AERODROMO | 1990 | 36.7%
CAMPO DE MARTE | 113 | 2.1%
GOVERNADOR ANDRÉ FRANCO MONTORO | 99 | 1.8%
CONGONHAS | 70 | 1.3%
VIRACOPOS | 65 | 1.2%
PRESIDENTE JUSCELINO KUBITSCHEK | 61 | 1.1%
BACACHERI | 57 | 1.1%
SANTOS DUMONT | 54 | 1.0%
AEROPORTO ESTADUAL DE JUNDIAÍ | 53 | 1.0%
CARLOS DRUMMOND DE ANDRADE / PAMPULHA | 49 | 0.9%
Other values (664) | 2610 | 48.2%
(Missing) | 196 | 3.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
de2629
 
18.1%
aerodromo1991
 
13.7%
fora1990
 
13.7%
200
 
1.4%
governador171
 
1.2%
campo158
 
1.1%
fazenda138
 
1.0%
carlos131
 
0.9%
estadual123
 
0.8%
marte114
 
0.8%
Other values (925)6877
47.4%

Most occurring characters

ValueCountFrequency (%)
O13787
14.9%
A11075
12.0%
R10845
11.7%
9321
10.1%
E8997
9.7%
D6843
 
7.4%
M3654
 
4.0%
N3216
 
3.5%
I3078
 
3.3%
S2870
 
3.1%
Other values (64)18794
20.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter82690
89.4%
Space Separator9321
 
10.1%
Other Punctuation207
 
0.2%
Dash Punctuation117
 
0.1%
Lowercase Letter84
 
0.1%
Decimal Number47
 
0.1%
Open Punctuation7
 
< 0.1%
Close Punctuation7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O13787
16.7%
A11075
13.4%
R10845
13.1%
E8997
10.9%
D6843
8.3%
M3654
 
4.4%
N3216
 
3.9%
I3078
 
3.7%
S2870
 
3.5%
F2584
 
3.1%
Other values (27)15741
19.0%
Lowercase Letter
ValueCountFrequency (%)
o12
14.3%
a9
10.7%
r8
9.5%
i7
8.3%
s7
8.3%
e7
8.3%
c6
 
7.1%
t4
 
4.8%
u4
 
4.8%
d3
 
3.6%
Other values (10)17
20.2%
Decimal Number
Value | Count | Frequency (%)
1 | 17 | 36.2%
4 | 16 | 34.0%
3 | 4 | 8.5%
5 | 3 | 6.4%
6 | 2 | 4.3%
0 | 2 | 4.3%
7 | 2 | 4.3%
2 | 1 | 2.1%
Other Punctuation
ValueCountFrequency (%)
/187
90.3%
*7
 
3.4%
'7
 
3.4%
.6
 
2.9%
Dash Punctuation
ValueCountFrequency (%)
-116
99.1%
1
 
0.9%
Space Separator
ValueCountFrequency (%)
9321
100.0%
Open Punctuation
ValueCountFrequency (%)
(7
100.0%
Close Punctuation
ValueCountFrequency (%)
)7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin82774
89.5%
Common9706
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
O13787
16.7%
A11075
13.4%
R10845
13.1%
E8997
10.9%
D6843
8.3%
M3654
 
4.4%
N3216
 
3.9%
I3078
 
3.7%
S2870
 
3.5%
F2584
 
3.1%
Other values (47)15825
19.1%
Common
ValueCountFrequency (%)
9321
96.0%
/187
 
1.9%
-116
 
1.2%
117
 
0.2%
416
 
0.2%
(7
 
0.1%
*7
 
0.1%
)7
 
0.1%
'7
 
0.1%
.6
 
0.1%
Other values (7)15
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII91092
98.5%
None1387
 
1.5%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O13787
15.1%
A11075
12.2%
R10845
11.9%
9321
10.2%
E8997
9.9%
D6843
 
7.5%
M3654
 
4.0%
N3216
 
3.5%
I3078
 
3.4%
S2870
 
3.2%
Other values (49)17406
19.1%
None
ValueCountFrequency (%)
Ã317
22.9%
É280
20.2%
Í212
15.3%
Á193
13.9%
Ó117
 
8.4%
Ç89
 
6.4%
Â75
 
5.4%
Ú60
 
4.3%
Ô26
 
1.9%
Ê11
 
0.8%
Other values (4)7
 
0.5%
Punctuation
ValueCountFrequency (%)
1
100.0%

aeronave_fase_operacao
Categorical

HIGH CORRELATION

Distinct31
Distinct (%)0.6%
Missing26
Missing (%)0.5%
Memory size42.4 KiB
POUSO
917 
DECOLAGEM
883 
CRUZEIRO
831 
CORRIDA APÓS POUSO
616 
TÁXI
390 
Other values (26)
1754 

Length

Max length31
Median length21
Mean length9.736783528
Min length4

Characters and Unicode

Total characters52491
Distinct characters30
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowPOUSO
2nd rowDESCIDA
3rd rowESPECIALIZADA
4th rowCRUZEIRO
5th rowPOUSO

Common Values

Value | Count | Frequency (%)
POUSO | 917 | 16.9%
DECOLAGEM | 883 | 16.3%
CRUZEIRO | 831 | 15.3%
CORRIDA APÓS POUSO | 616 | 11.4%
TÁXI | 390 | 7.2%
SUBIDA | 378 | 7.0%
APROXIMAÇÃO FINAL | 292 | 5.4%
MANOBRA | 235 | 4.3%
ESPECIALIZADA | 139 | 2.6%
DESCIDA | 138 | 2.5%
Other values (21) | 572 | 10.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
pouso1533
19.9%
decolagem896
11.6%
cruzeiro831
10.8%
corrida616
 
8.0%
após616
 
8.0%
táxi390
 
5.1%
subida378
 
4.9%
aproximação309
 
4.0%
final303
 
3.9%
manobra235
 
3.1%
Other values (39)1590
20.7%

Most occurring characters

ValueCountFrequency (%)
O7171
13.7%
A5488
10.5%
R4183
 
8.0%
I3937
 
7.5%
E3923
 
7.5%
S3010
 
5.7%
U2974
 
5.7%
C2961
 
5.6%
D2810
 
5.4%
P2686
 
5.1%
Other values (20)13348
25.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter50185
95.6%
Space Separator2306
 
4.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O7171
14.3%
A5488
10.9%
R4183
 
8.3%
I3937
 
7.8%
E3923
 
7.8%
S3010
 
6.0%
U2974
 
5.9%
C2961
 
5.9%
D2810
 
5.6%
P2686
 
5.4%
Other values (19)11042
22.0%
Space Separator
ValueCountFrequency (%)
2306
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin50185
95.6%
Common2306
 
4.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
O7171
14.3%
A5488
10.9%
R4183
 
8.3%
I3937
 
7.8%
E3923
 
7.8%
S3010
 
6.0%
U2974
 
5.9%
C2961
 
5.9%
D2810
 
5.6%
P2686
 
5.4%
Other values (19)11042
22.0%
Common
ValueCountFrequency (%)
2306
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII50696
96.6%
None1795
 
3.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O7171
14.1%
A5488
10.8%
R4183
 
8.3%
I3937
 
7.8%
E3923
 
7.7%
S3010
 
5.9%
U2974
 
5.9%
C2961
 
5.8%
D2810
 
5.5%
P2686
 
5.3%
Other values (14)11553
22.8%
None
ValueCountFrequency (%)
Ó616
34.3%
Á497
27.7%
Ç335
18.7%
Ã334
18.6%
Í10
 
0.6%
Ê3
 
0.2%

aeronave_tipo_operacao
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct9
Distinct (%)0.2%
Missing148
Missing (%)2.7%
Memory size42.4 KiB
PRIVADA
1867 
REGULAR
1059 
INSTRUÇÃO
702 
TÁXI AÉREO
617 
AGRÍCOLA
514 
Other values (4)
510 

Length

Max length13
Median length7
Mean length8.086164358
Min length7

Characters and Unicode

Total characters42606
Distinct characters24
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPRIVADA
2nd rowREGULAR
3rd rowAGRÍCOLA
4th rowEXPERIMENTAL
5th rowAGRÍCOLA

Common Values

Value | Count | Frequency (%)
PRIVADA | 1867 | 34.5%
REGULAR | 1059 | 19.5%
INSTRUÇÃO | 702 | 13.0%
TÁXI AÉREO | 617 | 11.4%
AGRÍCOLA | 514 | 9.5%
EXPERIMENTAL | 238 | 4.4%
POLICIAL | 148 | 2.7%
NÃO REGULAR | 64 | 1.2%
ESPECIALIZADA | 60 | 1.1%
(Missing) | 148 | 2.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
privada1867
31.4%
regular1123
18.9%
instrução702
 
11.8%
táxi617
 
10.4%
aéreo617
 
10.4%
agrícola514
 
8.6%
experimental238
 
4.0%
policial148
 
2.5%
não64
 
1.1%
especializada60
 
1.0%

Most occurring characters

ValueCountFrequency (%)
A7068
16.6%
R6184
14.5%
I3840
 
9.0%
E2574
 
6.0%
P2313
 
5.4%
L2231
 
5.2%
O2045
 
4.8%
D1927
 
4.5%
V1867
 
4.4%
U1825
 
4.3%
Other values (14)10732
25.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter41925
98.4%
Space Separator681
 
1.6%

Most frequent character per category

Uppercase Letter
Value              Count  Frequency (%)
A                  7068   16.9%
R                  6184   14.8%
I                  3840   9.2%
E                  2574   6.1%
P                  2313   5.5%
L                  2231   5.3%
O                  2045   4.9%
D                  1927   4.6%
V                  1867   4.5%
U                  1825   4.4%
Other values (13)  10051  24.0%
Space Separator
Value    Count  Frequency (%)
(space)  681    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   41925  98.4%
Common  681    1.6%

Most frequent character per script

Latin
Value              Count  Frequency (%)
A                  7068   16.9%
R                  6184   14.8%
I                  3840   9.2%
E                  2574   6.1%
P                  2313   5.5%
L                  2231   5.3%
O                  2045   4.9%
D                  1927   4.6%
V                  1867   4.5%
U                  1825   4.4%
Other values (13)  10051  24.0%
Common
Value    Count  Frequency (%)
(space)  681    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  39390  92.5%
None   3216   7.5%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
A                 7068   17.9%
R                 6184   15.7%
I                 3840   9.7%
E                 2574   6.5%
P                 2313   5.9%
L                 2231   5.7%
O                 2045   5.2%
D                 1927   4.9%
V                 1867   4.7%
U                 1825   4.6%
Other values (9)  7516   19.1%
None
Value  Count  Frequency (%)
Ã      766    23.8%
Ç      702    21.8%
Á      617    19.2%
É      617    19.2%
Í      514    16.0%

aeronave_nivel_dano
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 49
Missing (%): 0.9%
Memory size: 42.4 KiB

NENHUM       1898
SUBSTANCIAL  1601
LEVE         1554
DESTRUÍDA    315

Length

Max length: 11
Median length: 9
Mean length: 7.088301043
Min length: 4

Characters and Unicode

Total characters: 38050
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: LEVE
2nd row: NENHUM
3rd row: SUBSTANCIAL
4th row: LEVE
5th row: SUBSTANCIAL

Common Values

Value        Count  Frequency (%)
NENHUM       1898   35.0%
SUBSTANCIAL  1601   29.6%
LEVE         1554   28.7%
DESTRUÍDA    315    5.8%
(Missing)    49     0.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value        Count  Frequency (%)
nenhum       1898   35.4%
substancial  1601   29.8%
leve         1554   28.9%
destruída    315    5.9%

Most occurring characters

Value             Count  Frequency (%)
N                 5397   14.2%
E                 5321   14.0%
U                 3814   10.0%
S                 3517   9.2%
A                 3517   9.2%
L                 3155   8.3%
T                 1916   5.0%
H                 1898   5.0%
M                 1898   5.0%
B                 1601   4.2%
Other values (6)  6016   15.8%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  38050  100.0%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
N                 5397   14.2%
E                 5321   14.0%
U                 3814   10.0%
S                 3517   9.2%
A                 3517   9.2%
L                 3155   8.3%
T                 1916   5.0%
H                 1898   5.0%
M                 1898   5.0%
B                 1601   4.2%
Other values (6)  6016   15.8%

Most occurring scripts

Value  Count  Frequency (%)
Latin  38050  100.0%

Most frequent character per script

Latin
Value             Count  Frequency (%)
N                 5397   14.2%
E                 5321   14.0%
U                 3814   10.0%
S                 3517   9.2%
A                 3517   9.2%
L                 3155   8.3%
T                 1916   5.0%
H                 1898   5.0%
M                 1898   5.0%
B                 1601   4.2%
Other values (6)  6016   15.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  37735  99.2%
None   315    0.8%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
N                 5397   14.3%
E                 5321   14.1%
U                 3814   10.1%
S                 3517   9.3%
A                 3517   9.3%
L                 3155   8.4%
T                 1916   5.1%
H                 1898   5.0%
M                 1898   5.0%
B                 1601   4.2%
Other values (5)  5701   15.1%
None
Value  Count  Frequency (%)
Í      315    100.0%

aeronave_fatalidades_total
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 10
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1456525752
Minimum: 0
Maximum: 10
Zeros: 5015
Zeros (%): 92.6%
Negative: 0
Negative (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 10
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6474127748
Coefficient of variation (CV): 4.444911282
Kurtosis: 52.5966278
Mean: 0.1456525752
Median Absolute Deviation (MAD): 0
Skewness: 6.433938698
Sum: 789
Variance: 0.419143301
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Sorted by count:
Value  Count  Frequency (%)
0      5015   92.6%
1      209    3.9%
2      112    2.1%
3      25     0.5%
4      24     0.4%
5      20     0.4%
6      5      0.1%
8      3      0.1%
7      3      0.1%
10     1      < 0.1%

Sorted by value, ascending:
Value  Count  Frequency (%)
0      5015   92.6%
1      209    3.9%
2      112    2.1%
3      25     0.5%
4      24     0.4%
5      20     0.4%
6      5      0.1%
7      3      0.1%
8      3      0.1%
10     1      < 0.1%

Sorted by value, descending:
Value  Count  Frequency (%)
10     1      < 0.1%
8      3      0.1%
7      3      0.1%
6      5      0.1%
5      20     0.4%
4      24     0.4%
3      25     0.5%
2      112    2.1%
1      209    3.9%
0      5015   92.6%
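
The summary figures reported for aeronave_fatalidades_total can be re-derived from its value counts alone. The sketch below is a sanity check, not the report's own code. It assumes the variance uses the sample convention (n - 1 in the denominator), which is what reproduces the reported 0.419143301.

```python
from math import sqrt

# Value counts for aeronave_fatalidades_total, copied from the report.
counts = {0: 5015, 1: 209, 2: 112, 3: 25, 4: 24, 5: 20, 6: 5, 7: 3, 8: 3, 10: 1}

n = sum(counts.values())                        # number of observations
total = sum(v * c for v, c in counts.items())   # the "Sum" row
mean = total / n

# Sample variance, assuming n - 1 in the denominator.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                                 # coefficient of variation
zeros_pct = 100 * counts[0] / n                 # share of zero-valued rows

print("n =", n, "sum =", total)
print("mean =", round(mean, 10))
print("variance =", round(variance, 9))
print("std =", round(std, 10))
print("CV =", round(cv, 9))
print("zeros (%) =", round(zeros_pct, 1))
```

Because more than nine in ten rows are zero, the median and both quartiles are 0 and the interquartile range collapses to 0, while the mean and standard deviation are driven entirely by the small tail of fatal occurrences.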

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
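
As an illustration of the rank-based calculation described above, here is a minimal pure-Python sketch. The helper names (`average_ranks`, `pearson`, `spearman`) and the toy data are assumptions made for this example and are not taken from the report's implementation.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman(x, y):
    """Spearman's rho is Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
print("rho =", rho, "r =", r)
```

On this toy data ρ is 1 up to floating-point rounding, because the ordering is perfectly preserved, while r falls below 1 because the relationship is not a straight line.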

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
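
The formula and the invariance property can be checked with a short sketch. The function name `pearson_r` and the sample values are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# Hypothetical, nearly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor flips the sign but keeps the magnitude.
r_flipped = pearson_r(x, [-b for b in y])

print(r, r_rescaled, r_flipped)
```

The first two values agree up to rounding, and the third is their negation.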

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
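
The pair-counting definition above translates directly into code. The sketch below implements exactly that formula, often called tau-a. Statistical libraries commonly default to a tie-adjusted variant, so results on data with ties may differ from this version. The function name and the toy data are assumptions for illustration.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
        # direction == 0 means a tie, which counts for neither side
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: 10 pairs in total, 7 concordant and 3 discordant.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(tau)  # (7 - 3) / 10 = 0.4
```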

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
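
A minimal sketch of the bias-corrected estimator follows, written from the published Bergsma (2013) formulas rather than from the report's source code. The function name `cramers_v_corrected` and the toy category labels are assumptions for illustration.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long lists of categories."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals, col_totals = Counter(x), Counter(y)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a constant variable carries no association
    return sqrt(phi2_corr / denom)

# Toy data: one pair of perfectly associated variables, one independent pair.
v_perfect = cramers_v_corrected(["A"] * 50 + ["B"] * 50, ["u"] * 50 + ["v"] * 50)
v_independent = cramers_v_corrected(["A", "A", "B", "B"] * 25, ["u", "v", "u", "v"] * 25)
print(v_perfect, v_independent)
```

The `max(0.0, ...)` clamp is what keeps the corrected coefficient at exactly 0 when the observed association is no larger than what chance alone would produce.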

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation on Phik is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
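
Nullity correlation, as described for the heatmap, can be sketched as an ordinary Pearson correlation between "value is present" indicators. This is an illustrative assumption about how such a figure could be computed, not the report's own code. The function name and the three toy columns are hypothetical.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a fully present or fully missing column
    return cov / sqrt(var_a * var_b)

# Hypothetical columns; None marks a missing cell.
origin = ["A", None, "B", None, "C", "D"]
destination = ["X", None, "Y", None, "Z", "W"]   # missing in the same rows
remark = [None, "p", None, "q", None, None]      # present only where origin is missing

same = nullity_correlation(origin, destination)
opposite = nullity_correlation(origin, remark)
print(same, opposite)
```

A value near +1 means two columns tend to be missing together, and a value near -1 means one tends to be present exactly when the other is missing.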

Sample

First rows

index,ocorrencia_classificacao,ocorrencia_cidade,ocorrencia_uf,ocorrencia_aerodromo,ocorrencia_dia,ocorrencia_hora,investigacao_aeronave_liberada,investigacao_status,divulgacao_relatorio_publicado,total_recomendacoes,total_aeronaves_envolvidas,ocorrencia_saida_pista,ocorrencia_localizacao,ocorrencia_DT,ocorrencia_mes,ocorrencia_tipo,ocorrencia_tipo_categoria,taxonomia_tipo_icao,aeronave_matricula,aeronave_operador_categoria,aeronave_tipo_veiculo,aeronave_fabricante,aeronave_modelo,aeronave_tipo_icao,aeronave_motor_tipo,aeronave_motor_quantidade,aeronave_pmd,aeronave_pmd_categoria,aeronave_assentos,aeronave_ano_fabricacao,aeronave_registro_categoria,aeronave_registro_segmento,aeronave_voo_origem,aeronave_voo_destino,aeronave_fase_operacao,aeronave_tipo_operacao,aeronave_nivel_dano,aeronave_fatalidades_total
0,INCIDENTE,PORTO ALEGRE - RS,RS,SBPA,05/01/2012,20:27:00,NaN,FINALIZADA,NÃO,0,1,NÃO,NaN,2012-05-01 20:27:00,5.0,ESTOURO DE PNEU,FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | ESTOURO DE PNEU,SCF-NP,PRCDL,PARTICULAR,AVIÃO,RAYTHEON AIRCRAFT,58,BE58,PISTÃO,BIMOTOR,2495,2495,6.0,2003.0,AVIÃO,PARTICULAR,FORA DE AERODROMO,FORA DE AERODROMO,POUSO,PRIVADA,LEVE,0
1,ACIDENTE,GUARULHOS - SP,SP,SBGR,06/01/2012,13:44:00,SIM,FINALIZADA,SIM,3,1,NÃO,-23.4355555556 / -46.4730555556,2012-06-01 13:44:00,6.0,COM PESSOAL EM VOO,OUTROS | COM PESSOAL EM VOO,OTHR,PRTKB,NaN,AVIÃO,AEROSPATIALE AND ALENIA,ATR-42-500,AT45,TURBOÉLICE,BIMOTOR,18600,18600,50.0,2001.0,AVIÃO,REGULAR,FORA DE AERODROMO,FORA DE AERODROMO,DESCIDA,REGULAR,NENHUM,0
2,ACIDENTE,VIAMÃO - RS,RS,NaN,06/01/2012,13:00:00,NaN,FINALIZADA,SIM,0,1,NÃO,NaN,2012-06-01 13:00:00,6.0,FALHA DO MOTOR EM VOO,FALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO,SCF-PP,PTGOO,NaN,AVIÃO,NEIVA INDUSTRIA AERONAUTICA,EMB-201,IPAN,PISTÃO,MONOMOTOR,1800,1800,1.0,1976.0,AVIÃO,ESPECIALIZADA,FORA DE AERODROMO,FORA DE AERODROMO,ESPECIALIZADA,AGRÍCOLA,SUBSTANCIAL,0
3,ACIDENTE,SÃO SEBASTIÃO - SP,SP,NaN,06/01/2012,17:00:00,NaN,NaN,NÃO,0,1,NÃO,NaN,2012-06-01 17:00:00,6.0,FALHA DO MOTOR EM VOO,FALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO,SCF-PP,PUUSS,EXPERIMENTAL,ULTRALEVE,NaN,P2004 BRAVO,ULAC,PISTÃO,MONOMOTOR,580,580,2.0,2007.0,ULTRALEVE,EXPERIMENTAL,FORA DE AERODROMO,FORA DE AERODROMO,CRUZEIRO,EXPERIMENTAL,LEVE,0
4,ACIDENTE,SÃO SEPÉ - RS,RS,NaN,06/01/2012,16:30:00,SIM,FINALIZADA,SIM,0,1,NÃO,NaN,2012-06-01 16:30:00,6.0,PERDA DE CONTROLE NO SOLO,PERDA DE CONTROLE NO SOLO,LOC-G,PTUCL,NaN,AVIÃO,NEIVA INDUSTRIA AERONAUTICA,EMB-201A,IPAN,PISTÃO,MONOMOTOR,1800,1800,1.0,1986.0,AVIÃO,ESPECIALIZADA,FORA DE AERODROMO,FORA DE AERODROMO,POUSO,AGRÍCOLA,SUBSTANCIAL,0
5,INCIDENTE,UBATUBA - SP,SP,NaN,06/01/2012,14:30:00,NaN,FINALIZADA,NÃO,0,1,NÃO,NaN,2012-06-01 14:30:00,6.0,COLISÃO COM AVE,COLISÃO COM AVE,BIRD,PRBGF,PARTICULAR,HELICÓPTERO,EUROCOPTER FRANCE,EC 120 B,EC20,TURBOEIXO,MONOMOTOR,1715,1715,5.0,2000.0,HELICÓPTERO,PARTICULAR,FORA DE AERODROMO,FORA DE AERODROMO,CRUZEIRO,PRIVADA,LEVE,0
6,INCIDENTE GRAVE,CAMPINAS - SP,SP,SDAI,07/01/2012,18:15:00,SIM,FINALIZADA,SIM,0,1,NÃO,NaN,2012-07-01 18:15:00,7.0,PERDA DE CONTROLE NO SOLO,PERDA DE CONTROLE NO SOLO,LOC-G,PTKYP,NaN,AVIÃO,CIA AERONAUTICA PAULISTA,CAP-4,PAUL,PISTÃO,MONOMOTOR,587,587,2.0,1940.0,AVIÃO,INSTRUÇÃO,FORA DE AERODROMO,FORA DE AERODROMO,DECOLAGEM,INSTRUÇÃO,SUBSTANCIAL,0
7,INCIDENTE,BELÉM - PA,PA,SBBE,08/01/2012,19:12:00,NaN,FINALIZADA,NÃO,0,1,NÃO,NaN,2012-08-01 19:12:00,8.0,ESTOURO DE PNEU,FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | ESTOURO DE PNEU,SCF-NP,PRMHX,REGULAR,AVIÃO,AIRBUS INDUSTRIE,A320-214,A320,JATO,BIMOTOR,77000,77000,184.0,2008.0,AVIÃO,REGULAR,FORA DE AERODROMO,FORA DE AERODROMO,CORRIDA APÓS POUSO,REGULAR,LEVE,0
8,ACIDENTE,CONCEIÇÃO DAS ALAGOAS - MG,MG,NaN,08/01/2012,16:00:00,NaN,FINALIZADA,NÃO,0,1,NÃO,-19.9133333333 / -48.2930555556,2012-08-01 16:00:00,8.0,FALHA DO MOTOR EM VOO,FALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO,SCF-PP,PTLUP,PARTICULAR,AVIÃO,CESSNA AIRCRAFT,P210N,P210,PISTÃO,MONOMOTOR,1812,1812,6.0,1982.0,AVIÃO,PARTICULAR,FORA DE AERODROMO,FORA DE AERODROMO,CRUZEIRO,PRIVADA,SUBSTANCIAL,0
9,INCIDENTE,UBERLÂNDIA - MG,MG,SBUL,08/01/2012,22:13:00,NaN,FINALIZADA,NÃO,0,1,NÃO,NaN,2012-08-01 22:13:00,8.0,COM TREM DE POUSO,FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | COM TREM DE POUSO,SCF-NP,PRZRB,EXPERIMENTAL,AVIÃO,JOSE ROBERTO BARBOSA,BUMERANGUE EX-27 CROSS-CONTRY,ZZZZ,PISTÃO,MONOMOTOR,1260,1260,4.0,2011.0,AVIÃO,EXPERIMENTAL,FORA DE AERODROMO,FORA DE AERODROMO,POUSO,EXPERIMENTAL,LEVE,0

Last rows

index,ocorrencia_classificacao,ocorrencia_cidade,ocorrencia_uf,ocorrencia_aerodromo,ocorrencia_dia,ocorrencia_hora,investigacao_aeronave_liberada,investigacao_status,divulgacao_relatorio_publicado,total_recomendacoes,total_aeronaves_envolvidas,ocorrencia_saida_pista,ocorrencia_localizacao,ocorrencia_DT,ocorrencia_mes,ocorrencia_tipo,ocorrencia_tipo_categoria,taxonomia_tipo_icao,aeronave_matricula,aeronave_operador_categoria,aeronave_tipo_veiculo,aeronave_fabricante,aeronave_modelo,aeronave_tipo_icao,aeronave_motor_tipo,aeronave_motor_quantidade,aeronave_pmd,aeronave_pmd_categoria,aeronave_assentos,aeronave_ano_fabricacao,aeronave_registro_categoria,aeronave_registro_segmento,aeronave_voo_origem,aeronave_voo_destino,aeronave_fase_operacao,aeronave_tipo_operacao,aeronave_nivel_dano,aeronave_fatalidades_total
5407,INCIDENTE,MANAUS - AM,AM,SBEG,30/12/2021,14:41:00,SIM,FINALIZADA,NÃO,0,1,NÃO,-3.04111111 / -60.05055556,2021-12-30 14:41:00,12.0,ESTOURO DE PNEU,FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | ESTOURO DE PNEU,SCF-NP,PTOCV,NaN,AVIÃO,EMBRAER,EMB-110P1,E110,TURBOÉLICE,BIMOTOR,5670,5670,21.0,1981.0,AVIÃO,TÁXI AÉREO,CARAUARI,EDUARDO GOMES,CORRIDA APÓS POUSO,TÁXI AÉREO,LEVE,0
5408,INCIDENTE,SÃO PAULO - SP,SP,SBSP,30/12/2021,13:15:00,SIM,FINALIZADA,NÃO,0,1,SIM,-23.626111 / -46.656389,2021-12-30 13:15:00,12.0,COM TREM DE POUSO,FALHA OU MAU FUNCIONAMENTO DE SISTEMA / COMPONENTE | COM TREM DE POUSO,SCF-NP,PPAFP,NaN,AVIÃO,CESSNA AIRCRAFT,208B,C208,TURBOÉLICE,MONOMOTOR,3969,3969,7.0,2013.0,AVIÃO,PARTICULAR,CONGONHAS,DOUTOR RAMALHO FRANCO,TÁXI,PRIVADA,NENHUM,0
5409,INCIDENTE,SÃO PAULO - SP,SP,SBSP,30/12/2021,13:15:00,SIM,FINALIZADA,NÃO,0,1,SIM,-23.626111 / -46.656389,2021-12-30 13:15:00,12.0,EXCURSÃO DE PISTA,EXCURSÃO DE PISTA,RE,PPAFP,NaN,AVIÃO,CESSNA AIRCRAFT,208B,C208,TURBOÉLICE,MONOMOTOR,3969,3969,7.0,2013.0,AVIÃO,PARTICULAR,CONGONHAS,DOUTOR RAMALHO FRANCO,TÁXI,PRIVADA,NENHUM,0
5410,ACIDENTE,JATAÍ - GO,GO,NaN,30/12/2021,20:30:00,SIM,ATIVA,NÃO,0,1,NÃO,-17.999194 / -51.642861,2021-12-30 20:30:00,12.0,EXCURSÃO DE PISTA,EXCURSÃO DE PISTA,RE,PTWBA,NaN,AVIÃO,EMBRAER,EMB-202A,IPAN,PISTÃO,MONOMOTOR,1800,1800,1.0,2013.0,AVIÃO,PARTICULAR,PISTA DE POUSO EVENTUAL,PISTA DE POUSO EVENTUAL,DECOLAGEM,AGRÍCOLA,SUBSTANCIAL,0
5411,ACIDENTE,JATAÍ - GO,GO,NaN,30/12/2021,20:30:00,SIM,ATIVA,NÃO,0,1,NÃO,-17.999194 / -51.642861,2021-12-30 20:30:00,12.0,PERDA DE CONTROLE NO SOLO,PERDA DE CONTROLE NO SOLO,LOC-G,PTWBA,NaN,AVIÃO,EMBRAER,EMB-202A,IPAN,PISTÃO,MONOMOTOR,1800,1800,1.0,2013.0,AVIÃO,PARTICULAR,PISTA DE POUSO EVENTUAL,PISTA DE POUSO EVENTUAL,DECOLAGEM,AGRÍCOLA,SUBSTANCIAL,0
5412,ACIDENTE,MARACAÍ - SP,SP,NaN,31/12/2021,09:30:00,SIM,ATIVA,NÃO,0,1,NÃO,-22.585556 / -50.753889,2021-12-31 09:30:00,12.0,OPERAÇÃO A BAIXA ALTITUDE,OPERAÇÃO A BAIXA ALTITUDE,LALT,PRVPR,NaN,AVIÃO,AIR TRACTOR,AT-502B,AT5T,TURBOÉLICE,MONOMOTOR,3629,3629,1.0,2018.0,AVIÃO,AGRÍCOLA,NaN,NaN,MANOBRA,ESPECIALIZADA,SUBSTANCIAL,0
5413,ACIDENTE,MARACAÍ - SP,SP,NaN,31/12/2021,09:30:00,SIM,ATIVA,NÃO,0,1,NÃO,-22.585556 / -50.753889,2021-12-31 09:30:00,12.0,PERDA DE CONTROLE EM VOO,PERDA DE CONTROLE EM VOO,LOC-I,PRVPR,NaN,AVIÃO,AIR TRACTOR,AT-502B,AT5T,TURBOÉLICE,MONOMOTOR,3629,3629,1.0,2018.0,AVIÃO,AGRÍCOLA,NaN,NaN,MANOBRA,ESPECIALIZADA,SUBSTANCIAL,0
5414,INCIDENTE GRAVE,NOVO HAMBURGO - RS,RS,SSNH,31/12/2021,11:59:00,SIM,FINALIZADA,NÃO,0,1,NÃO,-29.695833 / -51.081667,2021-12-31 11:59:00,12.0,POUSO BRUSCO,CONTATO ANORMAL COM A PISTA | POUSO BRUSCO,ARC,PPFLY,NaN,AVIÃO,AERO BOERO,AB-115,AB11,PISTÃO,MONOMOTOR,770,770,2.0,1990.0,AVIÃO,INSTRUÇÃO,NOVO HAMBURGO,NOVO HAMBURGO,POUSO,INSTRUÇÃO,LEVE,0
5415,INCIDENTE,CURITIBA - PR,PR,SBBI,31/12/2021,15:12:00,SIM,FINALIZADA,NÃO,0,1,NÃO,-25.403333 / -49.233611,2021-12-31 15:12:00,12.0,COLISÃO COM OBSTÁCULOS NO SOLO,COLISÃO NO SOLO | COLISÃO COM OBSTÁCULOS NO SOLO,GCOL,PTWSA,NaN,AVIÃO,BEECH AIRCRAFT,58,BE58,PISTÃO,BIMOTOR,2449,2449,6.0,1972.0,AVIÃO,ADMINISTRAÇÃO DIRETA,SANTOS DUMONT,BACACHERI,TÁXI,POLICIAL,LEVE,0
5416,INCIDENTE,PETROLINA - PE,PE,SBPL,31/12/2021,20:30:00,SIM,FINALIZADA,NÃO,0,1,NÃO,-9.3675 / -40.56361111111,2021-12-31 20:30:00,12.0,FALHA DO MOTOR EM VOO,FALHA OU MAU FUNCIONAMENTO DO MOTOR | FALHA DO MOTOR EM VOO,SCF-PP,PRGXM,NaN,AVIÃO,BOEING COMPANY,737-8EH,B738,JATO,BIMOTOR,70533,70533,199.0,2013.0,AVIÃO,REGULAR,ORLANDO BEZERRA DE MENEZES,GOVERNADOR ANDRÉ FRANCO MONTORO,SUBIDA,REGULAR,LEVE,0